Exploring factors and customer perceptions of airport services: A quantitative textual analysis

This study analyzed 1,500 pieces of user generated content from Google Travel to examine the factors influencing airport services and customer perceptions. Quantitative textual analysis was employed to extract meaningful insights. Our findings highlight the most frequently used words in airport user generated content, reflecting critical aspects of airport experiences such as the airport itself, the quality of service, and international travel. A cluster analysis revealed five distinct clusters, representing flight operations, location, views, customer feelings, and intangible services. A co-occurrence network analysis showed strong correlations among keywords associated with positive experiences, underscoring the importance of service quality and infrastructure in customer satisfaction. Furthermore, through topic modeling, we categorized words into five distinct groups: airport flights, ground services, international services, customer experience, and location. The insights have practical value: they can help airport management identify strengths and areas needing improvement, ultimately enhancing customer satisfaction and the overall airport experience.


Introduction
Dhaka is the capital city of Bangladesh and the epicenter of all commercial and economic activities in the country. Consequently, Hazrat Shahjalal International Airport (HSIA), the airport that serves the city, is the busiest in the region. As reported by Wadud [1], HSIA catered to 83% of air passengers and 89% of air cargo in Bangladesh in 2010. Moreover, HSIA accommodated 18,681,474 passengers in 2019. Although its figures remain lower after the Coronavirus outbreak, it is a rapidly growing airport. HSIA acts as a hub for several airlines: Biman Bangladesh Airlines, Regent Airways, US-Bangla Airlines, SkyAir, Easy Fly Express, and Novoair [2].
Given Dhaka's significance as Bangladesh's primary international gateway, international passengers account for a substantial part of HSIA's traffic. The air transportation industry, including airlines and their supply chains, is estimated to generate US$449 million of GDP in Bangladesh. An additional US$320 million in foreign tourist spending further bolsters the country's economy, bringing the total to US$769 million. Hence, 0.3 percent of the country's gross domestic product is generated by inputs to the air transport sector and foreign tourists arriving by air [3]. Due to the country's strong GDP growth, international travel from Bangladesh has seen a significant surge. Consequently, the construction of a new airport has topped the national political agenda for several years [1]. In light of this, a third airport terminal is currently under construction [4]. This will eventually increase the passenger and cargo handling capacity of the airport.
In this study, we employed quantitative textual review analysis techniques to identify factors influencing customer perceptions and the service quality at Hazrat Shahjalal International Airport. Since airports around the world continue to adopt market-oriented business strategies, service quality is considered the center of airport management. As a result, improved efforts have been made, particularly among the best performing airports, such as Incheon Airport in South Korea, Changi Airport in Singapore, and others [5].
Despite the wealth of research on passenger satisfaction and service quality within the air transportation industry, studies utilizing online reviews for comprehensive analysis remain scarce, particularly for airports in developing countries. Previous research, such as the work by Adeniran and Fadare [5] and Halpern and Mwesiumo [6], has significantly contributed to our understanding of passenger satisfaction and the impact of service failure on airport promotion. These studies, however, have primarily relied on traditional survey methodologies and have not focused on the rich and diverse insights that can be gleaned from unstructured customer-generated online content.
This gap in the literature is particularly pronounced for Hazrat Shahjalal International Airport (HSIA), the largest airport in Bangladesh, which serves as a critical hub for both regional and international connectivity. Despite its importance, there has been a notable absence of studies exploring passenger satisfaction and service quality at HSIA through the lens of online reviews. This research seeks to bridge this gap by employing a novel methodological approach that combines cluster analysis, co-occurrence analysis, and topic modeling to analyze customer-generated online reviews of HSIA. This approach allows for a more nuanced understanding of passenger experiences, expectations, and areas of concern, offering insights that are not readily accessible through traditional survey-based research.
The choice of HSIA as the focus of our study is motivated by several factors. Firstly, the airport's strategic importance in the region and its role in facilitating international travel and commerce make it a vital subject for study. Secondly, the burgeoning digital landscape in Bangladesh and the increasing reliance on online platforms for service feedback make online reviews an invaluable source of data for understanding customer perceptions. Finally, exploring passenger satisfaction and service quality at HSIA through online reviews not only addresses a significant gap in the academic literature but also provides actionable insights for airport management to enhance service delivery, improve passenger experiences, and contribute to the overall development of the aviation sector in Bangladesh.
By focusing on HSIA and utilizing online reviews, this study offers a pioneering contribution to the field, presenting a novel perspective on airport service quality analysis. It highlights the untapped potential of online customer feedback in generating comprehensive insights into passenger satisfaction and service quality, setting a precedent for future research in similar contexts.

Airport service quality analysis
Previous studies have explored airport operations and services from several perspectives. Some researchers analyzed passenger expectations and experiences, while others studied airport operational efficiency and productivity using various airport performance assessment methodologies. Throughout the service industry, it has become increasingly important to understand the customer's overall perception of quality [7]. An empirical study by Fodness and Murray [8] found that passenger expectations for airport services are multidimensional and identified three key dimensions: interaction, function, and diversion. Through in-depth interviews and surveys of over one thousand passengers who frequently use airport services, they developed and empirically tested a conceptual model of airport service quality. This enabled them to propose a set of recommendations on measuring airport service quality [9].
Most of these recommendations focus on evaluating the quality of passenger-oriented services. According to Lubbe et al. [10], passengers' opinions are the primary indicator of airline operations, making it essential to analyze passengers' expectations of airport services. Services must be defined and assessed by passengers. Lubbe et al. conducted a study at OR Tambo International Airport (South Africa), utilizing the model proposed by Fodness and Murray [8]. Their study examined three areas of airport services: interaction, function, and diversion.
The interaction dimension was evaluated based on the speed of complaint processing, the degree of individual attention provided, and the promptness in responding to inquiries. The functional dimension encompassed parameters related to effectiveness, such as exterior signage, airport service signs, physical layout, a variety of transportation options for accessibility, convenient locations of baggage trolleys, and the availability of connecting flights. The wait times for luggage, registration speeds, and the time taken for passengers to leave the aircraft were also considered as efficiency-defining criteria. For diversion, the third dimension of airport operation assessment, three groups of criteria were used: maintenance (availability of retail outlets, restaurants serving local cuisine, and stores reflecting traditional local culture), décor (an environment consistent with local culture and a variety of artistic expressions), and productivity (conference organization services, business centers, quiet areas). The research findings revealed that corporate and leisure travelers have different perceptions about the importance of the services offered and the level of airport operational efficiency. A comparison of the expectations of frequent and occasional flyers regarding airport services was also conducted to understand the differences. The study further established that passengers place a high value on the interaction dimension when assessing airport services [9].

User generated content
The digital age has given birth to user generated content, which is a critical component of service [11]. User generated content, based on customer experiences, provides information about products, services, and brands and allows customers to express their opinions [12]. Several platforms allow customers to share their service experiences, including Google Travel and Twitter, as well as travel-specific platforms such as TripAdvisor and Skytrax [13]. These platforms make user generated content accessible at any time, from anywhere, and give it a wide reach [14].
User generated content can therefore be regarded as customer feedback received via online platforms [15]. The use of online platforms enables customers to rate and evaluate the services they have received [14]. The growing popularity of these platforms has made passengers' voices louder and more influential than ever before, benefiting both passengers and airport operators [16]. Kuflik et al. [17] proposed a generic framework for automating the collection, filtering, classification, and semantic processing of transport-related tweets. Their study found that positive emotions tend to prevail in customer generated content about transport on social media. Several of the challenges discussed in that work concern the development of automated analysis that aggregates micro-level social media data about transport experiences [18].
Customer generated content significantly influences the decision-making process of potential passengers since it is deemed an objective and reliable information source [6]. For managers, user generated content provides a wealth of real-time customer feedback data. Consequently, user generated content presents a cost-effective, timely, and efficient method of gathering consumer feedback in the airline industry [11].
User generated content has been utilized as secondary data in various marketing applications, ranging from pricing to brand management [19,20]. In recent years, due to the widespread use of Web 2.0, customer generated content has been proposed as an alternative method to assess Airport Service Quality (ASQ). Passengers are increasingly expressing their opinions on service providers, i.e., airports [13,16]. As a result, user generated content amplifies customers' voices, allowing more effective ASQ strategies to be monitored and developed [15,21].

Quantitative textual review
Text mining software such as KH Coder, available free of charge, can be used in computational linguistics as well as in quantitative content analysis. There is no restriction on the input language, although this work was conducted entirely in English. Data preprocessing removes certain characters and words from the text, and it is integral to the running of the analysis and the usability of the results [22]. Although KH Coder is not able to analyze texts semantically, it does provide many useful features [23].
With KH Coder, any word in a text can be analyzed in relation to other words, and the results are displayed using visual aids such as co-occurrence networks and hierarchical cluster analysis [24]. It offers functionalities such as KWIC (Key Word In Context), collocation statistics, co-occurrence networks, self-organizing maps, multidimensional scaling, clustering, and correspondence analysis using backend tools such as Stanford POS Tagger, FreeLing, Snowball stemmer, MySQL, and R [25]. Song et al. [26] define this technique as an extractive method rather than an abstractive one, since the final result consists of grouped words that are already present in the document.
Research in economics has also used KH Coder to classify documents into different categories, which was previously a time-consuming and subjective process carried out by researchers [24,27]. Researchers worldwide have regarded it highly and have used it in a wide variety of disciplines, including neuroscience, sociology, psychology, public health, media studies, education research, and computer science. In comparison to WordStat, KH Coder has been reviewed as an easy-to-use application for identifying themes in large unstructured databases such as online user generated content [28].

Data collection
The data for this study was obtained from Google Travel. Due to its popularity, Google Travel offers an extensive collection of customer generated content that reflects genuine customer experiences and sentiments. The platform provides a diverse and comprehensive set of user generated content, and its widespread adoption across a wide array of demographics and geographies makes it an invaluable source for academic research [29]. Because Google reviews are transparent and open, researchers are able to access and analyze the data without significant barriers, which supports the accuracy and authenticity of the data. Each piece of user generated content contains the review text and a rating [30].
The data was collected using Outscraper, an API tool for scraping social media data [30]. To gauge customer satisfaction levels, we collected around 3,000 pieces of user generated content between January and June 2023. After screening them, we removed those that did not contain any actual comments or any important information. Consequently, 1,500 reviews were suitable for further analysis. Figure 1 shows a sample of the user generated content. By combining data mining techniques with user generated content, this study was able to gain valuable insights into airport service performance.
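The screening step described above can be sketched in code. The following is a minimal illustration only, not the authors' procedure: the source describes a manual review, whereas this sketch uses a word-count threshold as a rough proxy for "contains an actual comment". The field names `text` and `rating`, the function `has_usable_text`, and the `min_words` threshold are all hypothetical.

```python
def has_usable_text(review, min_words=3):
    """Return True when a review carries enough text to analyze.

    The min_words threshold is a hypothetical stand-in for the manual
    judgment of whether a review contains important information.
    """
    text = (review.get("text") or "").strip()
    return len(text.split()) >= min_words


# Toy records; the field names "text" and "rating" are assumptions.
raw_reviews = [
    {"text": "Nice airport but too many mosquitoes", "rating": 3},
    {"text": "", "rating": 5},       # rating only, no comment
    {"text": None, "rating": 4},     # missing comment
    {"text": "Good", "rating": 5},   # too short to be informative
    {"text": "Immigration service was slow and staff were rude", "rating": 1},
]

usable_reviews = [r for r in raw_reviews if has_usable_text(r)]
print(len(raw_reviews), "collected;", len(usable_reviews), "kept")
```

In practice an automated filter like this would likely serve only as a first pass before manual inspection.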

Data analysis
Figure 2 shows the flow of this study. KH Coder was used to analyze the data. For the analysis, the Excel file containing the user generated content was imported into KH Coder. The "Run PreProcessing" command was used to preprocess the file, which segmented the sentences into words and stored the results in a database. With the processed data, we generated a frequency list, a cluster analysis, a co-occurrence network, and a topic model of words.

Frequency analysis
A frequency analysis identifies the words that are frequently used in a document. To analyze the user generated content, we extracted 1,500 entries from Google Travel. After running these entries through KH Coder for frequency analysis, we found more than 1,500 unique words. From this word list, we removed articles, conjunctions, prepositions, and relative pronouns. Redundant, repeated, or unnecessary terms were also eliminated, leaving only relevant noun and adjective keywords related to airport experiences. Figure 3 illustrates the 100 most frequently used words, which include 'airport', 'good', and 'international'. Interestingly, the list also contains words such as 'mosquito', 'bad', and 'poor'. Overall, these keywords offer a mixed impression of the quality of airport services.
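The idea behind the frequency list can be illustrated with a short sketch. This is not KH Coder's implementation, which relies on part-of-speech tagging and lemmatization; it is a simplified stand-in that tokenizes on letters and drops a small, hypothetical stop-word set. The sample reviews are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word set covering articles, conjunctions, and prepositions.
STOP_WORDS = {"the", "a", "an", "and", "but", "or", "in", "of", "to",
              "is", "was", "which", "that", "at", "very"}


def word_frequencies(reviews, stop_words=STOP_WORDS):
    """Count how often each non-stop word appears across all reviews."""
    counts = Counter()
    for text in reviews:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts


# Invented sample reviews.
reviews = [
    "The airport is good and the service is good",
    "Bad service at the international airport",
    "Nice international airport but mosquito problem",
]

freq = word_frequencies(reviews)
for word, n in freq.most_common(3):
    print(word, n)
```

A full analysis would additionally restrict the list to nouns and adjectives, as described above, which requires a part-of-speech tagger.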

Cluster analysis
Cluster analyses impose hierarchical structures on samples [31]. In KH Coder, the user generated content was segmented into five distinct cluster groups, each denoted by a different color. KH Coder allows the user to exclude unnecessary words from the analysis by using "stop words". We manually added a few words to be excluded from the analysis. These words included "part", "area", "new", "year", "thing", "one", "many", "lot", "much", "available", "other", "main", and "more".
In Figure 4, the first cluster, colored red, describes the airport's flight operation (FO). 'Terminal' is the most significant word in this group, followed by 'construction', 'facility', 'flight', and 'domestic'. Here, 'terminal' is linked with 'construction', while 'flight' is associated with 'domestic' and 'facility'. The first two and the last three words together form this entire cluster.
The second cluster, colored blue, describes the location (L). 'Airport' is the most frequently used word, followed by 'international', 'world', 'country', and 'city'. Similar to the first cluster, 'airport' and 'international' form one sub-cluster, while 'country', 'city', and 'world' comprise another.
The third group, in olive color, contains the fewest words, with 'place' and 'nice' describing the view (V). The next group, in magenta, details customers' feelings (F), with 'good' being the top word, followed by 'service', 'bad', 'passenger', and 'experience'. 'Experience' and 'bad' are internally connected at the sub-group level, while 'service', 'passenger', and 'good' constitute another sub-cluster.
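How words come to be grouped into clusters and sub-clusters can be shown with a small sketch of agglomerative (hierarchical) clustering. The source does not state which distance measure or linkage method was used, so this example assumes Jaccard distance between the sets of reviews in which each word appears and average linkage. The function names and the toy word-to-review mapping are hypothetical.

```python
from itertools import combinations


def jaccard_distance(docs_a, docs_b):
    """1 minus the share of reviews containing both words among those containing either."""
    union = docs_a | docs_b
    if not union:
        return 1.0
    return 1.0 - len(docs_a & docs_b) / len(union)


def cluster_words(word_docs, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[w] for w in sorted(word_docs)]

    def linkage(c1, c2):
        dists = [jaccard_distance(word_docs[a], word_docs[b])
                 for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Toy mapping: word -> ids of the reviews that contain it (invented data).
word_docs = {
    "terminal": {0, 1, 2},
    "construction": {0, 1},
    "flight": {3, 4},
    "domestic": {3, 4, 5},
    "good": {6, 7, 8},
    "service": {6, 7},
}

clusters = cluster_words(word_docs, n_clusters=3)
for group in clusters:
    print(sorted(group))
```

Words that tend to appear in the same reviews, such as 'terminal' and 'construction' in this toy data, are merged first, which mirrors the sub-cluster structure described above.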

Co-occurrence network analysis
The co-occurrence network is a method for creating graphs that display words which frequently appear together, with lines, or edges, connecting them (KH Coder, Co-Occurrence Network) [18]. In this study, co-occurrence analysis is applied based on ratings. As shown in Figure 5, keywords associated with 4-star or 5-star ratings, such as 'good', 'place', 'nice', 'bad', and 'service', exhibit a strong correlation. This implies that customers who gave the airport 4-star or 5-star reviews used these words to describe their experiences.
Although it has the highest frequency, the word 'airport' shows no connection with other words. Most high-rated words were associated with the service and infrastructure of Hazrat Shahjalal International Airport, Dhaka. The relationship between word ratings and frequency indicates that these factors significantly influence customer satisfaction and could be crucial to the airport's success.
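The construction of a rating-based co-occurrence network can be sketched as follows. This is an illustration under stated assumptions rather than the procedure KH Coder performs: it assumes edges are weighted by the Jaccard coefficient, restricts the input to reviews rated 4 stars or higher, and keeps only edges above a hypothetical threshold (`min_jaccard`). The sample reviews are invented.

```python
import re
from itertools import combinations


def cooccurrence_edges(reviews, min_rating=4, min_jaccard=0.5):
    """Return {(word_a, word_b): jaccard} for word pairs in high-rated reviews."""
    word_docs = {}
    for idx, (text, rating) in enumerate(reviews):
        if rating < min_rating:
            continue  # only high-rated reviews contribute to this network
        for word in set(re.findall(r"[a-z]+", text.lower())):
            word_docs.setdefault(word, set()).add(idx)

    edges = {}
    for a, b in combinations(sorted(word_docs), 2):
        shared = word_docs[a] & word_docs[b]
        if shared:
            jaccard = len(shared) / len(word_docs[a] | word_docs[b])
            if jaccard >= min_jaccard:
                edges[(a, b)] = jaccard
    return edges


# Invented (text, rating) pairs.
reviews = [
    ("good service", 5),
    ("good service nice place", 4),
    ("nice place", 5),
    ("bad service", 1),
]

edges = cooccurrence_edges(reviews)
for (a, b), weight in edges.items():
    print(a, "-", b, round(weight, 2))
```

A word can be frequent and still have no edges when it appears alongside many different words without pairing consistently with any of them, which is one possible reading of why 'airport' is unconnected in Figure 5.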

Topic modeling
In topic modeling, the researcher iteratively varies the number of topics until reaching the most descriptive grouping. The model is run several times with different numbers of topics, and the researcher selects the set of topics judged most appropriate for describing the entire document collection [14].
Figure 6 illustrates the result of topic modeling. We categorized the topics into five groups in alignment with the cluster analysis, with each category comprising 10 words. The topics were derived using KH Coder's topic modeling feature, which provides a structured method for identifying latent patterns and recurrent themes within large volumes of textual data. KH Coder utilizes Latent Dirichlet Allocation (LDA), a statistical model that represents each topic as a collection of words that frequently occur together. In addition, the model assigns coefficients to these topics, with a higher coefficient indicating a stronger association or relevance with the dataset. Topics with a higher coefficient are more likely to be central themes or focal points in the user generated content, which can provide a quantitative indicator of their importance in shaping customer perceptions and experiences [32].
In each of these equations, the contribution of individual words to the overall portrayal of each category in the topic modeling analysis is represented.A higher beta value indicates a larger contribution of a word in defining a category.
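The role of beta values can be illustrated with a small sketch. It does not fit an LDA model; it assumes that beta denotes the per-topic word probability, as in common LDA implementations, and shows how such values rank words within a category. The topic names come from this study, but the word weights below are invented toy numbers, and the function names are hypothetical.

```python
def normalize_topic(word_weights):
    """Convert raw word weights for one topic into probabilities (beta values)."""
    total = sum(word_weights.values())
    return {word: weight / total for word, weight in word_weights.items()}


def top_words(beta, n):
    """Return the n words with the highest beta in a topic."""
    return sorted(beta, key=beta.get, reverse=True)[:n]


# Invented weights for two of the five categories, for illustration only.
topic_word_weights = {
    "airport flights": {"flight": 30, "terminal": 20, "domestic": 10},
    "customer experience": {"good": 25, "service": 15, "bad": 10},
}

betas = {topic: normalize_topic(weights)
         for topic, weights in topic_word_weights.items()}

for topic, beta in betas.items():
    print(topic, top_words(beta, 2))
```

Within each topic the beta values sum to one, so a word with a higher beta accounts for a larger share of that category's vocabulary.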

Discussion
Using KH Coder, we devised a systematic methodology to understand the user ratings and user generated content for Hazrat Shahjalal International Airport (HSIA), Dhaka. The frequency analysis revealed that the most frequently used words in the user generated content were "airport", "good", "international", "nice", "place", and "terminal". However, the presence of words like "mosquito", "bad", and "poor" indicated mixed perceptions about the airport's service quality.
We then divided the words into various clusters based on the results of our cluster analysis. In recent years, cluster analysis has become more popular among systematists working on the development of classification systems. Through cluster analysis, we segregated the user generated content into five clusters: flight operation (FO), location (L), view (V), customer's feelings (F), and intangible service (IS). Each of these clusters represents a characteristic aspect of the airport experience. Further, there are sub-clusters emphasizing specific connections and themes.
We also conducted a co-occurrence analysis based on ratings, revealing that words like "good", "place", "nice", "bad", and "service" were frequently used in 4-star or 5-star reviews, indicating a range of experiences. Despite having the highest frequency, the term "airport" did not show strong connections with other words. By using co-occurrence words and Principal Component Analysis, we were able to identify important terms as well as the underlying structure of the data without sacrificing significant information [20].
Topics derived from Latent Dirichlet Allocation (LDA) made it possible to identify latent patterns in the content generated by customers. Topics related to airport service, security, and flight operations showed high correlation coefficients, indicating that these topics are important in influencing customer perceptions. According to recent research trends, topic modeling has been used to analyze big data in a variety of ways [14]. In this study, topic modeling led to the identification of five categories corresponding to the cluster analysis. Each category comprised ten words describing airport services. As a result, the categories of airport flights, on-ground services, international services, customer experience, and location were represented. We also calculated the beta values for each category, reflecting the significance of each word within that category.

Implications
This study offers both theoretical and practical implications. User generated content expressing customers' opinions allowed better identification of customer needs. Additionally, topic modeling and sentiment analysis have become popular tools for big data analysis, seeing increased utilization in the tourism and hospitality sectors. Through the analysis of customer user generated content in the airline sector, varied customer feedback can be derived, serving as a valuable marketing foundation for the future [14].
The combination of cluster analysis, co-occurrence analysis, and topic modeling using KH Coder in a single study is relatively rare. In previous studies, these methods were examined separately to determine customer satisfaction [14]. This study therefore opens up the potential for a new methodological model for research studies. The findings also have significant implications for airport industry management, since strategies for improving airport service quality can be developed based on them.
From a practical standpoint, the study identifies several categories and factors associated with airport services. Airport authorities can use this information to prioritize areas for improvement. By understanding customer satisfaction levels through the keywords in customers' user generated content, airport management can enhance the overall customer experience by addressing both positive and negative aspects.

Conclusion
This research offers valuable insights into airport services and customer perceptions through the analysis of user generated content. The frequency analysis, cluster analysis, co-occurrence network analysis, and topic modeling reveal different aspects of airport services that contribute to the overall customer experience. Hazrat Shahjalal International Airport caters to a wide array of domestic and international passengers and serves as a major transportation hub in Bangladesh. As a third terminal is coming into operation, future research should consider a larger sample size and incorporate other methodologies for a more comprehensive understanding of factors influencing airport services.
Additionally, a detailed examination of customer preferences and expectations can help identify specific areas for improvement. A similar study could be conducted in other international and domestic airports in Bangladesh using the results of this study. The findings of this study may also be relevant for airport transportation systems in other developing countries. Ultimately, this study aims to enhance airport services and customer satisfaction in Bangladesh.

Limitations and further study
This study offers insightful findings on customer perceptions at Hazrat Shahjalal International Airport (HSIA) in Dhaka, yet it is important to acknowledge certain limitations that frame the context of our conclusions. The research focuses exclusively on HSIA, which provides a detailed case study but may limit the direct applicability of findings to other airports with different operational characteristics or passenger profiles. While this specificity enables a deep dive into HSIA's customer service dynamics, it suggests a pathway for future studies to explore similar analyses across various airports to broaden the understanding of passenger satisfaction globally.
Additionally, the data collection span from January to June 2023 offers a snapshot of passenger experiences within this timeframe. This period does not encompass the full annual cycle of travel seasons, which could influence service expectations and satisfaction levels. Recognizing this, the study's time frame presents an opportunity for subsequent research to investigate how these factors vary throughout the year, thereby capturing a more comprehensive picture of passenger sentiments and airport service quality.
The methodology employed, primarily topic modeling, was chosen for its ability to distill significant themes from customer-generated content. While this approach yields valuable insights into the prevalent topics of discussion among passengers, it inherently involves a level of interpretation. Future studies might enrich these findings by incorporating sentiment analysis, offering a dual lens through which to view not only what passengers are discussing but also the emotional tenor of their feedback. Such complementary analyses would deepen our understanding of passenger experiences, blending the thematic with the emotive for a fuller picture.

Figure 4. Cluster analysis of the user generated contents.

Figure 5. Co-occurrence result based on ratings.