Elsevier

Appetite

Volume 55, Issue 3, December 2010, Pages 522-527

Research report
Collecting accurate secondary foodscape data. A reflection on the trials and tribulations

https://doi.org/10.1016/j.appet.2010.08.020

Abstract

In a special issue of the American Journal of Preventive Medicine (2009, 36(4S)), themed around the measurement of food and physical activity environments, Brownson, Hoehner, Day, Forsyth, and Sallis (2009, 118) made a plea for increased attention to be paid to the difficulties and complexities associated with the gathering of secondary data, and its subsequent refinement for use. Some of the peculiarities involved with the gathering and refining of secondary data, in particular data on the locations of food outlets in North East England are discussed in this paper.

‘Foodscape’ data is often invoked in Geographical Information Systems (GIS) based studies that seek to explore the geography of food availability/access in relation to outcomes such as obesity. However, results from a GIS-based analysis are only as strong as the data utilised. This paper explores the time consuming negotiations, possible expense and probable stress of acquiring foodscape data from a robust source (Local Councils within the North East Government Office Region (GOR), UK), considerations that may unfortunately influence the potential scope of research projects. Furthermore, this paper extends its remit to discuss the clerical issues that plague the ‘tidying up’ of such secondary information.

The paper will conclude by discussing how the time intensive sourcing and subsequent ‘cleaning’ of accurate secondary information is likely to be worthwhile, but will note that it is naïve to assume that (a) ‘gatekeepers’ will understand the necessity of your research and will thus cooperate accordingly, and (b) that the use of secondary data exonerates the researcher from ‘getting their hands dirty’. The paper also concludes by highlighting the disconnect between the high quality research that is so frequently called for, and the lack of robust data sets that are available for use in these investigations.

Introduction

Obesity is an increasing global concern, with the recent obesity ‘epidemic’ resulting in 65% of men and 56% of women being currently overweight (Body Mass Index (BMI) >25) in the UK alone (Zaninotto, Wardle, Stamatakis, Mindell, & Head, 2006). Modelled future trends suggest that UK levels of obesity (BMI >30) could reach 60% and 50% for men and women respectively by 2050 (Foresight, 2008). Whilst the true aetiology of obesity remains the subject of intense research, the environmental determinants of obesity are receiving particular attention; the ‘obesogenic’ environment is defined as “the sum of the influences that the surroundings, opportunities, or conditions of life have on promoting obesity in individuals or populations” (Swinburn & Egger, 2002). One important element of this multi-faceted obesogenic environment is that of the availability and access to food. At its most simple, obesity is ultimately about energy in (consumption) vs. energy out (expenditure) (Swinburn, Egger, & Raza, 1999), and thus we must consider the food environment as it may influence our opportunities to consume and our food choices.

The ‘foodscape’, or the food environment, incorporates all opportunities to obtain food within a given region (Lake et al., 2010, Townshend and Lake, 2009). Examples of such vendors range from established institutions such as restaurants and takeaways to those such as mobile food vans and vending machines. In the context of many epidemiological studies, especially those that seek to examine the determinants and spatiality of obesity (Black et al., 2010, Jeffery et al., 2006, Mobley et al., 2006, Morland and Evenson, 2009, Spence et al., 2009), accurate information pertaining to the reality of the ‘foodscape’ is essential.

In such studies, the use of Geographical Information Systems (GIS) to map and spatially analyse food availability is becoming commonplace, but as with other quantitative and inherently objective techniques of this ilk, such as any statistical analyses, the results of a GIS endeavour are only as strong as the data used to drive it. As Matthews, Moudon, and Daniel (2009) emphasise, it is easy to fall into the “garbage in, garbage out” trap. In other words, when foodscape data is shown to be inadequate or incomplete, the ramifications for the findings of the research are potentially grave. For example, an individual's access/exposure to food can be easily mis-represented if incomplete data sources are used.

Many sources of food outlet data have been used in the literature to date, including food store websites (Apparicio, Cloutier, & Shearmur, 2007), food marketing research companies (Austin et al., 2005, Kipke et al., 2007, Li et al., 2009, Moore and Diez-Roux, 2006, Powell et al., 2007, Zenk and Powell, 2008), governmental departments of health and sanitation (Kwate et al., 2009, Lake et al., 2010, Morland and Evenson, 2009, Morland et al., 2002, Pearce et al., 2007, Pearce et al., 2008, Zenk et al., 2005) (environmental health departments, in the UK), business listings/commercial directories (Maddock, 2004, Reidpath et al., 2002, Simmons et al., 2005, Sturm and Datar, 2005), and infrequently, the US Economic Census (Mehta & Chang, 2008), but there has been little discussion of the accuracy of these datasets. A recent paper by Lake et al. (2010) compared the accuracy of three such food environment data sets in the UK – the Yellow Pages, the Yellow Pages online (Yell.com) and data from the environmental health department of Newcastle City Council – and found that data from the local council came closest to the ‘gold standard’ of field validated reality (86.3% agreement). Therefore, it is believed that local council data provides the most accurate publicly available depiction of the UK food environment reality (Cummins and Macintyre, 2009, Lake et al., 2010).

Available to the public under the Freedom of Information (FOI) Act (Lake et al., 2010), databases of food outlet locations are maintained by local councils so that they can remain abreast of food availability within their respective jurisdictions and subsequently facilitate hygiene standards inspections. Data are widely available geographically, and are the most complete available record of food availability (Lake et al., 2010), as businesses are required to register themselves with their local council by law (Food Standards Agency, 2010). This necessity for registry extends even to mobile food vending premises (such as ice cream vans) and market stalls, both of which are deemed to require further study as contributing factors to the obesogenic environment (Odoms-Young, Zenk, & Mason, 2009), yet are omitted from the records of the Yellow Pages and Ordnance Survey (OS) Points of Interest (POI). Infrequently accessed sources of food such as pharmacies and some clothes shops are also accounted for in this dataset, adding to its utility in representing the holistic foodscape. Business names and addresses are recorded, allowing subsequent geocoding with a GIS. As the onus is on food vendors themselves to notify councils when/if they change ownership or cease trading, the most likely limitation of this dataset is its temporality (Lake et al., 2010).

This paper is divided into two sections: firstly, as part of work towards a larger research project, this paper presents an account of the ‘trials and tribulations’ undergone when trying to obtain data from this most esteemed and accurate source of secondary food environment data. Through this case study of collecting secondary foodscape data in North East England, this paper will argue that there is a disjuncture between the types of high quality research that are demanded in this field and the datasets that are available to facilitate such research, even when such data is in fact in existence.

The second part of this paper will comment on the peculiarities involved with and the resources required when trying to refine this particular type of data into a comprehensible and usable form. It is often assumed that the use of secondary data (and data on the foodscape is no exception) is an easier alternative to the gathering of primary data when conducting research. Indeed, the use of secondary data is often described as ‘more efficient’, ‘less costly’ and ‘more timely’ than using/collecting primary data (Boslaugh, 2007, Stewart and Kamins, 1993). Whilst issues surrounding the contemporaneousness of secondary data, and the fact that secondary data has usually been gathered for another purpose, prevail (Boslaugh, 2007), meaning that we should not see the use of any secondary data as problem free, one might assume, as Stewart and Kamins (1993) describe above, that secondary data represents a more straightforward alternative to the purposive gathering of primary data. Implicit in this assumption is the notion that such data is received in an eminently usable state, thus saving time and energy for the researcher (Finnegan, 2006). This is seldom the case in reality. Whilst problems with secondary data are not restricted to food environment measurement only, in a special issue of the American Journal of Preventive Medicine themed around the measurement of food and physical activity environments, Brownson et al. (2009) made a plea for increased attention to be paid to the difficulties and complexities inherent in the ‘streamlining’ of secondary food environment data in particular prior to use; this paper is one such response. Throughout, this paper aims to give practical and pragmatic advice on how to approach the data collection process for accurate foodscape data in the UK context. This said, it is believed that there are lessons to be learnt and techniques detailed that would be useful outwith this particular country setting.

It is argued that the implication of such data collection restrictions and refinement complexities is inevitably the use of inferior foodscape data in obesity research, with its implicit and subsequent consequences for research findings.

The aim of this study was to examine the extent to which it is feasible to obtain and refine foodscape data for an extensive geographical area, for use in further research. Foodscape data was to be sourced for the North East of England, namely the North East (NE) Government Office Region (GOR), an area covering 8,676.42 km² in its entirety (Office for National Statistics, 2005) (Fig. 1). The NE GOR comprises 23 local authorities of varying size, each being served by their respective local council (Fig. 1).

Councils were initially contacted in February 2009 through the Association for Public Service Excellence (APSE), whereby an email was sent to each local council, from us, upon instruction, by APSE (actual email not shown). It was believed that making contact with the councils through a recognised body such as APSE would yield the greatest response rate. A list of all active food outlets in the council region was explicitly asked for, and a two-month maximum compliance period was requested; 9/23 councils had sent their data within this initial deadline, despite the fact that councils are obliged to respond to FOI requests within 20 working days (Department for Constitutional Affairs, 2007). A reminder email (which resulted in no further responses) was sent at this point before the remaining 14 councils were contacted by phone in June 2009 (Fig. 2). Conversing with the leading environmental health officer in each council, in person, was the final step in obtaining food premises register data for the entirety of the North East – a process that took nearly seven months. It is evident from Fig. 2 that contact made over the phone (asking for the head of the environmental health department) appeared to elicit the most responses, whilst the reminder email proved fruitless. It was, however, far from an easy process, not least because of the frequency with which environmental health officers are ‘out of office’ and therefore not contactable. It paid to be insistent with requests once it was finally possible to converse with them, negotiating compliance within a mutually agreed period of time, or asking for an informed alternative contact, for example.

Initially, however, many councils were hesitant about disseminating their data (despite the fact that it should be freely available to the public under FOI) until they were told of the cooperation of other councils – reticence that would have been non-existent had they been fully aware of the guidelines surrounding the distribution of such data. Two councils initially cited ethical issues entwined with the dissemination of this data as reasons for non-compliance; however, they eventually conceded this position.

Several councils attempted to levy a fee for the preparation of the requested dataset – the following email response was typical:

“Thank you for your request for information on food outlet data. Your request was received on 30th June and I am dealing with it under the terms of the Freedom of Information Act 2000.

In some circumstances a fee may be payable and if that is the case, I will let you know the likely charges before proceeding.” (Anon., received 30/6/09).

Although councils are entitled to issue a ‘fees notice’ under the FOI act (Office of Public Sector Information, 2000), eventually most councils were appeased upon hearing that others had cooperated without charge. The speed with which some councils were able to respond to the original FOI request would suggest that accessing this information is a relatively simple and quick process, at least in most cases. Thus, from a researcher's perspective, it seems legitimate to expect such dissemination to occur without charge. Despite this, the gatekeeper at one council was especially resolved to non-compliance, stating that:

“The information requested can be provided at a cost of £52.86 plus VAT for up to 2 hours work for a simple request and £52.86 + £25.00 per hour plus VAT for more complex requests. It is estimated that preparation and collation of the data will take 1 hour. I will be obliged if you could forward a cheque for £60.79 (inc VAT) made out to ‘XXX Council’ if you wish us to proceed” (Anon., received 5/8/09)

However, the efficacy and importance of persistence when attempting to source secondary data should not be underestimated, and the data was eventually obtained free of charge. Given budgetary constraints, this was essential for this research.

Four councils were unable to transmit electronic copies of their data, necessitating the data being posted and then typed manually into a spreadsheet (n = 1876 records). This further hampered the speed of the data transmission process. One particular council was also unwilling to distribute data via email, highlighting instead the existence of their data on the ‘Scores on the Doors’ website (http://www.scoresonthedoors.org.uk/), a recent repository of food outlet information maintained by local councils – although at this time (18/12/09) only available for six councils within the study area and only 120/354 councils nationwide. Unfortunately, this data had to be manually transferred to a spreadsheet, too (n = 721 records), a resource-consuming complication that sadly could not be overcome through an electronic transmission of the council list, even after discussion with the council in question.

To compound these difficulties, the classification systems utilised to categorise the data appeared to vary between councils, requiring that the data be re-classified in a more suitable, rigorous and uniform way at a later stage (using a recently developed classification system by Lake et al. (2010)).

The data that was eventually received was also often far from complete (Table 1). Postcodes were frequently absent from the information received (n = 1842) and needed to be sought out using a combination of Yell.com, thephonebook.bt.com (the online British Telecom (BT) phonebook), Tyneside.com and, importantly, Google.co.uk. Postcodes were most frequently missing from day centres, schools, nursing homes and village halls. The exact locations of mobile street vendors and market stalls were particularly problematic, principally because they are not bound to a single location and are essentially free to locate anywhere. Ultimately, mobile food vendors were removed from the dataset completely (n = 433, 2.1%), as the addresses and postcodes given were often for the owner of the vehicle and not the retail location proper (although it was impossible to discern when this was the case). The absence of market traders, whose addresses could not be identified, is thought to be of little hindrance as these food vendors usually have a transient and fleeting presence at best. Although many postcodes (required for the representation of street addresses in a GIS) were still missing from the overall dataset after an exhaustive online search (n = 373), they constituted a small percentage of the total number of records (1.8%) and so this is still believed to represent one of the most complete and reliable sources of food outlet data possible. It is also possible that missing postcodes that could not be found belonged to businesses that are no longer in operation. Finally, it was not uncommon to find duplicate records within the data and vigilance is thus imperative – data from one council was especially problematic and repeated records (n = 154) had to be painstakingly removed.
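The two clerical checks described above (flagging records without a usable postcode, and removing repeated records) lend themselves to a small script. What follows is a minimal sketch, not the procedure used in this study: the record keys (name, address, postcode), the normalisation rule and the simplified postcode pattern are all assumptions made for illustration, and the sample records in the usage note are invented.

```python
import re

# Simplified pattern for the general shape of a UK postcode (e.g. "NE1 7RU").
# This is an illustrative assumption, not a full check of the official format.
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")


def normalise(text):
    """Upper-case and collapse whitespace so trivially different spellings match."""
    return " ".join(text.upper().split())


def clean_records(records):
    """Split records into (unique, missing_postcode, duplicates).

    Each record is a dict with the hypothetical keys 'name', 'address'
    and 'postcode'. A record counts as a duplicate when its normalised
    name, address and postcode have all been seen before.
    """
    seen = set()
    unique, missing_postcode, duplicates = [], [], []
    for rec in records:
        postcode = normalise(rec.get("postcode") or "")
        key = (normalise(rec["name"]), normalise(rec["address"]), postcode)
        if key in seen:
            duplicates.append(rec)
            continue
        seen.add(key)
        unique.append(rec)
        # Flag unique records whose postcode is absent or malformed,
        # so that they can be looked up manually afterwards.
        if not POSTCODE_RE.match(postcode):
            missing_postcode.append(rec)
    return unique, missing_postcode, duplicates
```

Called on a list such as [{"name": "Example Cafe", "address": "1 High Street", "postcode": "NE1 7RU"}, ...], the function returns the de-duplicated records alongside those still needing a postcode search. Exact matching after normalisation will not catch duplicates that differ in spelling or where only one copy carries a postcode, so a manual review of near matches would still be needed.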

Food outlets had been classified by environmental health departments in a number of disparate ways; some had adhered strictly to Food Standards Agency (FSA) classifications whilst others had not. Data from five councils arrived with no classification system applied whatsoever, despite this being explicitly requested and verbally acknowledged. Consequently, the uniform reclassification of all food outlets was necessary in order to complete the clerical ‘tidying’ of the dataset. The work of Lake et al. (2010) provided a recently developed and robust 22-point classification system, which was then modified to suit the needs of this enquiry, into a 20-point classification pro-forma. Food outlets were to be demarcated as one of the following types: Bakers – Retail; Café/Coffee Shop; Convenience Store; Department Store; Discount Store; Entertainment Venue; Fast Food; Health and Leisure Facility; Hotels/Function Rooms/Associations; Non-Food Stores/Novelty Items; Pharmacy; Pizzeria; Pub/Bar; Restaurant; Sandwich Shop; Specialist; Specialist Traditional; Supermarket; Takeaway and Unclassified. A 20-point classification framework was thought sufficient to allow a detailed and nuanced appreciation of the variety of food outlet types existing in reality (Lake et al., 2010).
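For readers wishing to apply the same 20-point scheme, the category list can be held as a fixed vocabulary against which incoming council labels are checked. The sketch below is illustrative only: the category names are those listed above, but the mapping of council labels (COUNCIL_LABEL_MAP) and the function name are hypothetical, and in this study most records were in fact categorised by searching for the business name rather than by any automatic mapping.

```python
# The 20 categories of the modified classification pro-forma, as listed above.
CATEGORIES = (
    "Bakers – Retail",
    "Café/Coffee Shop",
    "Convenience Store",
    "Department Store",
    "Discount Store",
    "Entertainment Venue",
    "Fast Food",
    "Health and Leisure Facility",
    "Hotels/Function Rooms/Associations",
    "Non-Food Stores/Novelty Items",
    "Pharmacy",
    "Pizzeria",
    "Pub/Bar",
    "Restaurant",
    "Sandwich Shop",
    "Specialist",
    "Specialist Traditional",
    "Supermarket",
    "Takeaway",
    "Unclassified",
)

# Hypothetical examples of council-specific labels and the uniform category
# each might map to; real council labels would need to be reviewed by hand.
COUNCIL_LABEL_MAP = {
    "public house": "Pub/Bar",
    "chemist": "Pharmacy",
    "take-away": "Takeaway",
}


def reclassify(council_label):
    """Return a uniform category for a council label, else 'Unclassified'."""
    if not council_label:
        return "Unclassified"  # no classification supplied by the council
    label = council_label.strip()
    if label in CATEGORIES:
        return label  # already expressed in the uniform scheme
    return COUNCIL_LABEL_MAP.get(label.lower(), "Unclassified")
```

Keeping ‘Unclassified’ as the fallback means that every record receives exactly one of the 20 labels, and the records that still need a manual search are easy to list.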

With the aid of numerous Google (google.co.uk) and Yell.com searches, it was possible to arduously categorise all valid food outlet records (n = 18,027) with only the name of the business known a priori. The removal of mobile food retailers, wholesalers, caterers, places of work/education and those for whom a postcode could not be found amounted to the removal of 5304 records, 26.2% of the data initially obtained from councils. It is worth restating that this dataset theoretically remains the most complete record of food that is available for purchase by the public – only those for whom a postcode could not be found, or those for whom a postcode would be inaccurate, are missing from the final dataset. Only 349 outlets (1.9%) fell into the ‘Unclassified’ category, with insufficient information available on the Internet to facilitate an accurate categorisation. It is worth noting, however, that in general there existed a systematic bias towards discovering more ‘Unclassified’ food outlets in rural areas; information on the types of food vended within these retailers was more frequently untraceable. It is also worth noting that such issues of reclassification would be moot if local councils had coded their data to a fixed standard, such as that offered by the Food Standards Agency (FSA).
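The percentages quoted in this section can be cross-checked with simple arithmetic. The total number of records initially obtained is not stated in the text, so the sketch below infers it from the figure of 5304 records being 26.2% of the initial data; that inferred total is an assumption and, because the published percentage is rounded, only approximate.

```python
def pct(part, whole):
    """Percentage of whole represented by part, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Figures reported in the text.
removed = 5304            # records removed in total
removed_share = 0.262     # reported as 26.2% of the data initially obtained
mobile_vendors = 433      # reported as 2.1%
postcode_not_found = 373  # reported as 1.8%
classified = 18027        # valid records categorised
unclassified = 349        # reported as 1.9%

# Assumption: the initial total is inferred, not taken from the source.
implied_initial = round(removed / removed_share)

print(implied_initial)
print(pct(mobile_vendors, implied_initial))
print(pct(postcode_not_found, implied_initial))
print(pct(unclassified, classified))
```

Under this inferred total of roughly 20,000 records, the shares for mobile vendors and untraceable postcodes reproduce the reported 2.1% and 1.8%, and the ‘Unclassified’ share of the 18,027 categorised records reproduces the reported 1.9%.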


Discussion and recommendations

The process of obtaining foodscape data was difficult, and intensive in terms of the time and work commitment that it required. This said, there are several observations to be made, and some lessons to be learnt that may streamline future data collection.

It became clear that the best course of action was to establish and maintain contact with lead environmental health department officers verbally, over the phone rather than by email, as is evidenced here by the fact that this method of

References (41)

  • L.M. Powell et al. (2007). Association between access to food stores and adolescent body mass index. American Journal of Preventive Medicine.
  • D.D. Reidpath et al. (2002). An ecological study of the relationship between social and environmental determinants of obesity. Health and Place.
  • R. Sturm et al. (2005). Body mass index in elementary school children, metropolitan area food prices and food outlet density. Public Health.
  • B. Swinburn et al. (1999). Dissecting obesogenic environments: the development and application of a framework for identifying and prioritising environmental interventions for obesity. Preventive Medicine.
  • T. Townshend et al. (2009). Obesogenic urban form: theory, policy and practice. Health and Place.
  • S.N. Zenk et al. (2008). US secondary schools and food outlets. Health and Place.
  • P. Apparicio et al. (2007). The case of Montreal's missing food deserts: evaluation of accessibility to food supermarkets. International Journal of Health Geographies.
  • Association of Public Health Observatories (2008). Health profile 2008.
  • S.B. Austin et al. (2005). Clustering of fast-food restaurants around schools: a novel application of spatial statistics to the study of food environments. American Journal of Public Health.
  • S. Boslaugh (2007). Secondary data sources for public health: a practical guide.

    The author would like to acknowledge the local authorities in the North East of England who provided the data for this study. Thanks also for the contributions of Dr. Amelia A Lake (Northumbria University, UK) and Dr. Seraphim Alvanides (Northumbria University, UK) who read and commented on drafts of this work. Thomas Burgoine is currently funded by an ESRC quota PhD studentship.
