Sep 09 2015

During the final stages of the County Surveys project, we are shifting our attention to the process of digitising some volumes. From the outset, a key aim of the project has been to scope the resources required to bring a full set of the county surveys together in a convenient digital format. The creation of the online bibliographic search tool was the first step towards this long-term aim, as it allows us to assess the potential of printed books for digitisation and the quality and access conditions of extant digitised copies. A second step was to carry out trial digitisations of a few key surveys. As we have noted elsewhere, this is a pilot, through which we will explore the potential requirements of a fuller, high-quality, full-text online collection.

Using the information surfaced through the bibliography, we identified a number of candidate surveys and discovered their locations. We found that one of the rarest volumes was held right here in Edinburgh, in the collection of the Royal Botanic Gardens Edinburgh, who kindly allowed us to work with their copy and use their state of the art digitisation equipment. The work was carried out by Phil Mellor, currently a PhD student at the University of Strathclyde, who did a great job of thoughtfully documenting the requirements and exploring different options for carrying out each step of the process.

The workflow that Phil sketched out  involved four stages: locate, capture, edit, OCR.

The first step was to locate the material: this involves finding an accessible volume (which may be easier said than done in some cases), assessing its binding, and checking for folded plates and other potential issues such as uncut pages. We chose two volumes: the General View of the Agriculture of the Shetland Islands (1814) by John Shirreff, which is a revised survey and has 228 pages; and the General View of the Agriculture in the Southern Districts of the County of Perth (1794) by James Robertson, a first series survey with 140 pages, bound with other volumes and featuring the long ‘s’ (ſ), the archaic letterform once printed in place of ‘s’ at the start or in the middle of a word.

The capture stage, which we were expecting to be the most time-consuming, turned out to be fairly quick. Using the RBGE equipment, two pages at a time can be photographed and, once the initial set-up is done, the cradle and cameras remain in place: Phil was able to capture an entire volume in under an hour. During the capture stage, we had the choice of creating RAW or JPEG files. There is an advantage to using RAW, as having a lossless format from the outset enables editing from a high-quality original at any point in the future. However, we also found that the quality of the JPEGs produced was high enough for good OCR, and the JPEG files were easier to edit and quicker to upload and transfer.

The next stage of the workflow was editing. Once the page images had been saved, they were uploaded into the editing software provided with the capture software, where the skew of the pages was corrected and the images cropped where necessary. Skewing is an inevitable result of capturing a bound book, in which the pages will always be presented at an angle that increases and decreases depending on the page at which the book is opened. This can be addressed during the capture process by changing the camera angles, or more quickly during the editing phase. Editing can be done on a page-by-page basis or on a chapter-by-chapter basis, and we found the latter to be sufficient for our needs. This stage could be quite time-consuming, depending on the quality of image required and the quality of images captured, but Phil found that editing a whole volume on a chapter-by-chapter basis took around two hours.

The final stage of the process was OCR-ing the images to produce text files. We tried a couple of different software packages, including the open-source programme Tesseract and the proprietary package ABBYY FineReader. Tesseract performed well and we were able to produce searchable PDFs and text files quickly and easily. However, it struggled with the historical print and the many unique Scottish names found in the surveys. ABBYY handled these comparatively well, and also automatically formatted the text to mimic the page image, which saved a considerable amount of editing work. Overall, then, the digitisation process was quicker and smoother than we had envisaged, which bodes well for future projects to complete the collection.
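For anyone wanting to try the Tesseract route themselves, the OCR step can be sketched roughly as follows. This is a minimal illustration in Python, not a record of the settings used in the pilot: it assumes the `tesseract` command-line tool is installed, that the edited page images are JPEGs in a single folder, and that English (`eng`) is a reasonable language model to start from. The function names and folder layout are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Build a Tesseract CLI call that writes <output_base>.txt and <output_base>.pdf."""
    # "txt" and "pdf" are output configs shipped with Tesseract; the PDF keeps
    # the page image and adds a hidden text layer, which makes it searchable.
    return ["tesseract", str(image_path), str(output_base), "-l", lang, "txt", "pdf"]


def ocr_volume(image_dir, output_dir, lang="eng"):
    """OCR every JPEG in image_dir, one page at a time; return the output bases."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    bases = []
    # Sorting by file name keeps pages in capture order (assumes names like page_001.jpg).
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        base = output_dir / image_path.stem
        subprocess.run(build_tesseract_command(image_path, base, lang), check=True)
        bases.append(base)
    return bases
```

Calling `ocr_volume("shetland_1814/edited", "shetland_1814/ocr")` (both paths invented for the example) would leave one text file and one searchable PDF per page image. Output from the default English model on long-‘s’ type and Scottish place names would still need checking by hand, as noted above.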

The above gave us a workflow which we would use as a template for any future digitisation. However, for comparison’s sake we digitised two further volumes with equipment kindly made available to us by Edinburgh University Library Centre for Research Collections: a different copy of the 1814 John Shirreff survey of the Shetland Islands and the survey of Ayrshire from 1811 by William Aiton. It might at first seem strange to do the Shetland survey again, but this allows us to make a direct comparison in terms of both process and quality of result. The conclusion, in terms of the process, was that the equipment at RBGE made the work considerably quicker and easier. Specifically, the most important characteristics of the scanner were:

  • The ability to capture two pages at once reduces the time considerably.
  • A cradle which does not require the book to be flat is both quicker and probably safer in terms of handling fragile and older books.
  • Integrated editing software which allows corrections to cropping and skew to be applied at the same time makes the workflow smoother.

The comparison not only gave us some more material which we can make available online but also confirmed that the process we outline above is a good way to proceed.

A final note on the Ayrshire survey: when this book was retrieved from the archives, the pages were still uncut. The library staff were able to assist in opening the book up but there is something which gives pause for thought: this was the first time this book had been read, and it was being read by an Optical Scanner for online processing, a ‘reader’ unimaginable to those people whose effort went into producing it.

The digitised volumes can be found within our service and below:

General View of the Agriculture of the County of Ayr, by William Aiton, 1811

General View of the Agriculture in the Southern Districts of the County of Perth, by James Robertson, 1794

General View of the Agriculture of the Shetland Islands, by John Shirreff, 1814


Sep 04 2015

We’re delighted to report that the County Surveys of Great Britain project has been attracting some media attention over the last few months.

In May, professional family historian and author Chris Paton wrote about the project in a post on the British Genes blog, a wonderful and widely-read resource for those interested in genealogy and local history.

In August, there was a brief write-up in the genealogy magazine Your Family Tree, which picked up on the fact that the surveys could be very valuable in providing context to those researching their ancestors.

The project will also feature in the autumn edition of Botanics, the magazine produced by the Royal Botanic Gardens Edinburgh, which will soon be available for download from their website.

Jul 27 2015

Some of the significance and much of the character of Sir John Sinclair’s ‘great pyramid’ comes from the many authors involved in reporting and writing up the surveys.  In the case of the  Statistical Accounts of Scotland, Sinclair drafted in local ministers to describe their parishes. Knowing their parishioners intimately, these men of the cloth were able to answer detailed questions about the place and the people, and frequently gave their individual opinions and perspectives on local tales, customs and morals.

The authors of the County Surveys, in contrast, were not of one profession or social position. The surveys were commissioned from a wide range of ‘intelligent gentlemen’, including university professors, farmers, landowners, clerics, professional writers, and political activists. Moreover, it was planned that “every farmer and gentleman in the district” would have the opportunity to read and remark on the first series, which would be revised to incorporate all their insights before final publication in the second series. It was, in other words, to be a collective undertaking by many hands, designed to provide the board with “a greater variety of information and a greater mass of instructive observations from a greater number of intelligent men for their consideration and guidance.”* The incentive for such men to give up their time and energy was not financial: indeed, several of the surveyors worked for free and most claimed only their expenses. Rather, they worked in the name of the public good and in the belief that their undertaking would be of significant value to their nation and its people.


Arthur Young, 1741-1820

While the stories of many of these contributors are lost to history, a few were historically notable individuals. The Reverend Dr. Walker, for example, who surveyed the Hebrides, was Professor of Natural History at the University of Edinburgh. A distinguished scientist with interests in botany, mineralogy and geology, and a pioneer in the study and teaching of agriculture, he had conducted exploratory tours of the Western Isles on behalf of the Board of Annexed Estates in the 1760s and 70s: a more suitable candidate for surveying these counties for the Board of Agriculture would be hard to imagine. Where Walker was a pillar of the establishment, Charles Vancouver was a more colourful figure. Like his older brother the explorer George Vancouver (who famously charted the Pacific Coast of North America in the early 1790s, and after whom the Canadian city Vancouver is named), Charles was a traveller and frontiersman in the American colonies. Of Dutch origin, and originally a farmer, he had spent decades working the land and writing about ‘natural philosophy’ in newly-settled Kentucky, before returning to the UK in the early 1790s. He would later work in the Netherlands, before returning to the Americas, using his ‘practical expertise’ in cultivation and farming to support himself. Vancouver’s friend and secretary to the Board, Arthur Young, was also an author and completed the survey for Suffolk. Young began his career in a mercantile house, but was more interested in travel, literature and politics than commerce. The author of four novels, pamphlets, magazines, and a number of travelogues, he was also interested in experimental agriculture and in the rights of agricultural workers. Although his experiments did not produce revolutionary results, as an astute social and political observer “he remains the greatest of English writers on agriculture.” (Higgs, Dictionary of National Biography, 1885-1900, Vol. 63, p. 362)

In the combined wisdom of such fascinating, experienced and erudite writers, supported by the numerous contributors whose names are lost to posterity, the county surveys offer us insights not just into the agriculture of the time but also into the intellectual milieu and social conditions of Romantic Britain.

*all quotations in this paragraph are from Appendix G of Sinclair’s  1797 Communications to the Board of Agriculture, on Subjects Relative to the Husbandry and Internal Improvement of the Country, Volume 1. p. xlviii-xlix.


Jul 16 2015

One of our key aims in building the interface for our collection was to allow people to explore and “play with” the data. It’s hard to get a sense of the extent of the series and the relationships between the surveys without some kind of overview: once you can see the surveys all together and look at them in different ways, it’s much easier to grasp their logic. So we wanted a tool that would aggregate all of the information we have gathered and then allow people to look at that information in flexible ways, to filter and explore it according to their interests.

Flexibility was also a priority in technical terms: we’re making this data available for the first time in this format, so we are aware that we don’t really know what people will want to do with it. We don’t see what we have done with the demonstrator as being the last word but rather the first. Based on this, we can start to understand the data better and start to understand how people might want to access it.  We expect to have to adapt the data and the ways of accessing it as we go along and we learn what we can most usefully provide to the community.

The Data

The process of gathering data has been described in another post, but from the demonstrator’s point of view what was important was to try to keep things as general and adaptable as possible. Nevertheless, this kind of historical data presents certain peculiarities and challenges. One of the most obvious is how to present the survey data. The surveys are arranged by county but the counties that were used are not the counties as they are today. Indeed, the counties used in the first and second phases of the county surveys are not the same. So we needed a mechanism which would allow people to make sense of the data without being restrictive. We’ve achieved this by providing a canonical list of counties taken from Ordnance Survey data from the early 19th century. We then map this to the actual counties as surveyed. There’s not a perfect match here but we take a “permissive” view of the data: we’d rather show you slightly too much than too little. So the user gets presented with the canonical list in the search facility and we then map that to the county data to decide what to show. The same holds for the author data. We hold a canonical list of authors and map these to the real authors. This allows us to adjust the data in future as we discover more about it.

The Data Model

This mapping then gives rise to the data model. We have surveys which have a county associated with them. Then we have a list of counties which we present to the user, each of which may map to more than one of the underlying counties. That can get a bit confusing but if we look at an example, it becomes clear. If we want to look at the surveys for Shetland then in the filter list we have “Zetland or Shetland”, which is how it is listed in the Ordnance Survey data. In the first phase of the surveys, Shetland was included under “Northern Counties and Islands” but in the second phase it has a survey of its own. The implication of this for the data model is that we have to have a one-to-many mapping from entries in the search list to the entries in the surveys. In fact, the same county survey might appear under more than one search term, e.g. the first phase “Northern Counties and Islands” needs to appear under Shetland, Orkney, Caithness and Sutherland. So we have to have a many-to-many mapping between the search counties in the interface and the counties as specified in the surveys themselves. To do this we adopt the standard database approach of having a mapping table, i.e. ccounty_county.

So ccounty is the list of counties as it appears in the search list and county is as they appear in the surveys and the mapping table allows us to relate these two to each other in any way we want. Each Survey can have many publications and each publication can be held in multiple places. This explains why we have separated out surveys from publications from holdings in the data model.
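To make the mapping table concrete, here is a small sketch of the three tables and the Shetland example. It is an illustration only: the column names, the use of an in-memory SQLite database (rather than the project’s Postgres), the helper functions, and the exact county strings other than “Zetland or Shetland” and “Northern Counties and Islands” are all assumptions made for the example, and it is written in Python rather than the Perl used for the demonstrator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ccounty (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);  -- search list
CREATE TABLE county  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);  -- as surveyed
CREATE TABLE ccounty_county (                                              -- mapping table
    ccounty_id INTEGER NOT NULL REFERENCES ccounty(id),
    county_id  INTEGER NOT NULL REFERENCES county(id),
    PRIMARY KEY (ccounty_id, county_id)
);
""")

# (search-list county, county as named in the surveys); illustrative rows only.
links = [
    ("Zetland or Shetland", "Northern Counties and Islands"),  # first phase
    ("Zetland or Shetland", "Shetland Islands"),               # second phase
    ("Orkney", "Northern Counties and Islands"),
    ("Caithness", "Northern Counties and Islands"),
    ("Sutherland", "Northern Counties and Islands"),
]
for search_name, surveyed_name in links:
    conn.execute("INSERT OR IGNORE INTO ccounty (name) VALUES (?)", (search_name,))
    conn.execute("INSERT OR IGNORE INTO county (name) VALUES (?)", (surveyed_name,))
    conn.execute(
        """INSERT INTO ccounty_county (ccounty_id, county_id)
           SELECT cc.id, c.id FROM ccounty cc, county c
           WHERE cc.name = ? AND c.name = ?""",
        (search_name, surveyed_name),
    )


def surveyed_counties(search_name):
    """Surveyed counties to show for one entry in the search list."""
    rows = conn.execute(
        """SELECT c.name FROM county c
           JOIN ccounty_county m ON m.county_id = c.id
           JOIN ccounty cc ON cc.id = m.ccounty_id
           WHERE cc.name = ? ORDER BY c.name""",
        (search_name,),
    )
    return [name for (name,) in rows]


def search_terms(surveyed_name):
    """Search-list entries under which one surveyed county appears."""
    rows = conn.execute(
        """SELECT cc.name FROM ccounty cc
           JOIN ccounty_county m ON m.ccounty_id = cc.id
           JOIN county c ON c.id = m.county_id
           WHERE c.name = ? ORDER BY cc.name""",
        (surveyed_name,),
    )
    return [name for (name,) in rows]
```

With these rows, `surveyed_counties("Zetland or Shetland")` returns both the first-phase and second-phase entries, while `search_terms("Northern Counties and Islands")` returns all four search-list counties. Adjusting the mapping later means adding or removing rows in `ccounty_county`, without touching the survey data itself.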

Database Schema

This model might seem a little complex but it gives us a great deal of flexibility in how we handle counties and authors and makes it fairly easy to add new information about publications and holdings as it becomes available to us.

The Technology Chosen

In line with the ethos of flexibility, we decided to work with standard technology components. At the back end is a relational database. Sitting on top of that is a Web Application built using a standard MVC framework. This approach has advantages in terms of the flexibility but also in terms of getting up and running quickly. The MVC approach (Model-View-Controller) separates out the storage of the data (the Model) from the logic of the application (the Controller) and how the data is displayed (the View). This means that changing one part of it has less impact because it is isolated from the other components. A good example of this flexibility is the change we made to the interface which was covered in a previous post.

The MVC approach is one of the standard development techniques for web applications these days, and when it comes to implementing it you have a wide choice of languages and MVC systems. In our case, it’s all written in Perl, using Postgres for the DB with a Catalyst application on top. So the application takes the standard Catalyst approach of using DBIx::Class to implement the Model and interface to the database, and Template Toolkit for the front end. The choice of specific MVC implementation doesn’t matter so much; there are plenty to choose from! It’s really the flexibility this approach gives which is the main thing. Using standard technologies gives us the adaptability we need, so that we can make the data available now and adapt to whatever changes come out of that down the line.
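The separation that MVC provides can be shown in a few lines. The sketch below is a toy in Python, not the Catalyst application itself: the class and function names are invented for illustration, and the two sample records are simply surveys mentioned elsewhere on this blog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Survey:
    county: str
    author: str
    year: int


class SurveyModel:
    """Model: owns the data and knows how to query it."""

    def __init__(self, surveys):
        self._surveys = list(surveys)

    def by_county(self, county):
        return [s for s in self._surveys if s.county == county]


def render_text(surveys):
    """View: turns query results into something displayable."""
    if not surveys:
        return "No surveys found."
    return "\n".join(f"{s.county} ({s.year}), {s.author}" for s in surveys)


class SearchController:
    """Controller: takes a request, asks the model, hands the results to a view."""

    def __init__(self, model, view=render_text):
        self.model = model
        self.view = view

    def search(self, county):
        return self.view(self.model.by_county(county))


model = SurveyModel([
    Survey("Shetland Islands", "John Shirreff", 1814),
    Survey("Ayr", "William Aiton", 1811),
])
controller = SearchController(model)
print(controller.search("Ayr"))
```

Because the controller only ever calls `self.view(...)`, swapping `render_text` for a function that produces HTML or JSON changes how results look without touching the model or the search logic, which is the kind of isolated change described above.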

Evolution by Use

So this demonstrator gives people access to look at the data. We’re hoping people will find it helpful in “playing with” the data. But it’s very much the first draft. We expect it to evolve over time as we and anyone else interested in the Surveys get to know the data better and we start to understand more about how to make this data available to people.

Jul 02 2015

Over the last few weeks we have been working in partnership with the Royal Botanic Gardens Edinburgh, who hold an excellent collection of County Surveys as part of their impressive collections. The RBGE is currently in the process of having their rare books comprehensively catalogued by the Rare Book Cataloguer from the Centre for Research Collections (CRC) at the University of Edinburgh, and we are pleased to be able to contribute to this process by assisting in the cataloguing of the County Survey holdings. Once they are complete, we hope that these new electronic records will form the basis of another data set for our online demonstrator.

The RBGE also has state of the art equipment and digitisation specialists in house: although they are currently involved in an extensive project to digitise specimens from the internationally renowned herbarium, staff have generously shared their knowledge and allowed us to use their equipment to digitise a few of the surveys. We are pleased to report this work is going very well and we should be able to make the digitised copies available soon, so watch this space.

Jun 24 2015

We are delighted to announce that our bibliographic search tool is now live and accessible from the ‘Search’ tab in the menu above.

Our demonstrator includes bibliographic data from some of the best collections of the surveys and, where possible, provides links to library catalogue entries and  digital editions. Researchers can search by modern county name, by series, by county and by author. Results are presented in a new tab after each search, so that you can compare multiple search results by toggling between pages. There are also detailed analyses of collections, revealing the extent of holdings and coverage, and indicating which surveys would be needed to complete each collection.



We hope that the demonstrator will be a useful finding aid and discovery tool for those interested in the County Surveys, the history of statistical reporting and British history more broadly. We would welcome any feedback on the tool, and would be very keen to hear about how it is used or whether it could usefully offer other features and information. If you have ideas, please get in touch with us at

Jun 09 2015

In a recent post  (‘Who Read the County Surveys?’) I wrote about the insights that book reviews can give into historical reception and reading practices. Another interesting way of exploring reception is through researching the price of a book: for the amount that booksellers charge can give clues not just to the perceived value of the text but also the levels of disposable income available to the target markets.

The prices of the County Surveys varied, usually between 7 and 12 shillings, when they were sold on boards (this was common at the time: purchasers would then arrange for binding according to their own tastes and budgets). Using the great calculators provided on the brilliant Measuring Worth website, we can see how much this equates to in today’s money (2013 is the most recent data available), as well as how it compares to the average income and the labour costs of the time.

The surveys were published between 1794 and 1817, so let’s use the year 1806, in the middle of the range, as our point of comparison. Seven shillings in 1806 equates to a real price in 2013 of £24.77. Twelve shillings equates to a real price of £42.46. On this information, the surveys seem to be priced fairly reasonably: not particularly expensive, although one would not call them cheap. This apparent affordability may be deceptive, however: for, in order to really benefit from the instructive comparisons between counties that the surveys were intended to reveal, purchasers would have to buy multiple volumes. Moreover, the real price only indicates the relative cost of the volume, and must be read against the incomes of the time.

The average male agricultural worker in 1806 earned somewhere between £24 7s and £38 7s* per year. Let’s base our calculations on the lower end of the spectrum. There were 20 shillings to the pound, so £24 7s was 487 shillings per year, or about 40.6 shillings per month: so 7 shillings is roughly 17% of the average worker’s monthly wage. The present-day income value of £24 7s is £26,710. This gives a monthly wage of £2,225.83, and 17% of this is £378. This changes the picture quite significantly, suggesting the relative value of the book to a worker is much higher than the ‘real price’. Would you spend £378 on a single book? What kind of person would have the means to do that?
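For anyone who wants to check or vary the sums, the arithmetic can be laid out as a short script. It uses only the figures quoted in this post (the wage, the 7 shilling price and the £26,710 income value from Measuring Worth); the variable names are my own, and the script is a sketch of the calculation rather than anything taken from the calculators themselves.

```python
SHILLINGS_PER_POUND = 20  # pre-decimal currency: 20 shillings to the pound


def to_shillings(pounds, shillings=0):
    """Convert a pounds-and-shillings amount to shillings."""
    return pounds * SHILLINGS_PER_POUND + shillings


# Figures quoted in the post.
annual_wage_1806 = to_shillings(24, 7)     # £24 7s, in shillings
monthly_wage_1806 = annual_wage_1806 / 12  # shillings per month
book_price = 7                             # shillings, the cheaper surveys
modern_annual_income = 26_710              # income value of £24 7s, in 2013 pounds

share_of_month = book_price / monthly_wage_1806
modern_monthly_income = modern_annual_income / 12

# The post rounds the share to 17% before scaling it up; keeping the
# unrounded share gives a slightly higher figure.
equivalent_rounded = round(share_of_month, 2) * modern_monthly_income
equivalent_exact = share_of_month * modern_monthly_income

print(f"{annual_wage_1806} shillings a year, {monthly_wage_1806:.1f} a month")
print(f"A 7s book is {share_of_month:.1%} of a month's wage")
print(f"Equivalent today: about £{equivalent_rounded:.0f} (£{equivalent_exact:.0f} unrounded)")
```

Changing `book_price` to 12, or the wage to the upper estimate of £38 7s, shows how sensitive the comparison is to the starting assumptions; for the upper wage the modern income value would need to be looked up separately, since only the lower one is given here.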

We know that ‘improvement’ was the pursuit of landowners and that, notoriously in the case of the Scottish clearances, changes could be instituted at the expense of smaller tenant farmers. The figure of £378, which is for one volume rather than a set, suggests that the practical knowledge that the set of Surveys represents was really affordable only to the relatively wealthy, and not to common agricultural labourers, who likely could not have afforded the books. It thus raises interesting questions about the politics of Enlightenment improvement. To explore this further, it would be very interesting to research other reading contexts such as borrowing books: were the surveys acquired by libraries of the time (such as Innerpeffray, for example), and did their members borrow the volumes?

As Measuring Worth is at pains to point out, establishing value is far from an exact science and involves subjective interpretation and, in the case of historical values, there is obviously some speculation involved. I think these figures are interesting none-the-less, and although they do not lead to reliable conclusions, they do give a bit of a sense of the historical circumstances in which the Surveys were produced and consumed.


*This figure comes from taking the average in 1832 (the closest historical match I’ve found, from this paper by Gregory Clark, University of California) and using the Measuring Worth calculators to get the comparable wage for 1806.

May 26 2015

In a previous post, I mentioned that we are currently reviewing what information users will be able to see in the results page produced by searching our bibliographic database. The current fields displayed are country, county, author, phase and publication date. The most obvious omission here is of course title. However, for a number of reasons, it’s been impractical to include the titles during development and may not be practical in the online tool. This post explains why, and outlines some of the challenges presented by the survey titles.

A generic title page for the County Surveys


Firstly, there is the issue of length. Some of the survey titles extend to half a page, and if they were shown in full, they would significantly limit the number of results that could be shown on one screen. In addition there is the issue of repetition, as most follow a generic format. Here, for instance, is one typical title:

General View of the Agriculture of the Hebrides, or Western Isles of Scotland: with observations on the means of their improvement, together with a separate account of the principal islands; comprehending their resources, fisheries, manufactures, manners, and agriculture. Drawn up under the direction of the Board of Agriculture. With several maps.

Like all the other titles, it begins with the generic form ‘General View of the Agriculture of… with observations on the means of their improvement….’ Listing many titles in such a form is potentially confusing visually and means that a reader has to work harder than usual to scan and identify the different content. It also means that presenting a shortened title is difficult without reducing the title to the county name. In which case, why not simply list the county name? This is what we have done throughout the development process. But, as this example also shows, there is quite a lot of useful additional information which varies from title to title and which may attract slightly different groups of readers. Here for instance, the promise of an account of the manners of the islands makes the socio-historical interest of this volume explicit. The question is how we can format the title in such a way as to reveal that information without creating redundancy and repetition.

Unexpected title variations have also created challenges in gathering bibliographic data. The Irish surveys (which are mentioned but not detailed in our master bibliographies) have a significantly different title format. Rather than ‘general views’ they are titled ‘Statistical Survey of the County of… with observations on the means of their improvement’. We discovered this late in the process, which meant that we had to go back to the sources we had harvested information from and repeat the process. To complicate matters, even these variations are not consistent: anomalies such as the General view of the agriculture and mineralogy, present state and circumstances of the County Wicklow exist, making it very difficult to be sure we have identified all the relevant publications and holdings.

We will be experimenting with the format of the results page over the next few weeks, and hope to find a way to present the titles to include some of these interesting variations.



Apr 29 2015

1893 map of Shetland, from Cassell’s Gazetteer of Great Britain and Ireland; Published by Cassell and Company Limited, London.

One of the aims of our current project is to establish the cost and workflow requirements for creating a complete virtual collection of the County Surveys.  Many of the surveys are already available in various online archives but discovering them is not as easy as it could be and quality and accessibility remain quite variable. In the longer term, we hope to aggregate high quality full-text files that we can use for research-led text mining.  In order to establish the projected costs and labour involved in such a project, as part of the pilot we plan to identify one or two rare surveys and digitise them according to current best practices, documenting this process for ourselves and others. Clearly, as funds are limited, it makes sense to focus on volumes that are not already available in digital form and which are rare even in print.

One such candidate is the General view of the Agriculture of the Shetland Islands by John Shirreff, which was published in Edinburgh by Constable & Co in 1814. This is a volume which, according to one early 19th Century reviewer, was of particular interest to contemporary readers, for it describes “a remote part of the British dominions, with which many readers are perhaps as little acquainted as with the Islands in the South Sea; and they exhibit a state of Society very different in several respects from that which prevails in the other provinces of Britain.” Indeed, comparing Orkney and Shetland to the wilds of the American frontier, he suggests the inhabitants of these northern islands belong to a different, less civilised time and “bring into view a stage in the progress of improvement at which the inhabitants of the South has arrived some centuries ago, and which had been long since passed over by the people of almost every other part of the Island.” (The Farmer’s Magazine 15 (Aug 1814): 343) The exoticism, snobbery and geo-political bias of these remarks seem almost comical today, but they suggest that the contents of the Shetland survey may be of particular importance to historians given the apparently substantial differences from more ‘advanced’ mainland practices. Happily we will all be able to judge for ourselves soon, because a print copy of the Shetland survey is held here in Edinburgh at the Royal Botanic Gardens and they have kindly agreed to allow its digitisation: we’ll post about this process once it gets underway.

Apr 15 2015



Sir John Sinclair, ‘On Potato Flour’, York Herald (1817)

One of the key motivations behind the commissioning of the County Surveys was the Enlightenment zeal for ‘improvement’, which characterised late 18th Century Britain and drove the agricultural revolution. Until this time, the fundamentals of farming had changed little over the centuries, with many farmers still using the runrig and open field systems that originated in the Middle Ages. Attempting to modernise and increase productivity, ‘improving’ farmers developed new principles, worked to cultivate ever larger areas of land, and applied new scientific thought and discoveries to their practices. These changes had vast social implications. As the notorious Highland Clearances showed, in some cases they were devastating for farming communities and rural life. Yet, as thinkers like Sinclair knew, ‘improvement’ would be key to Britain’s future, enabling the nation to support its growing population in towns and in the colonies as it became a modern industrial state.

His 1817 article ‘On Potato Flour’ gives an insight into Sinclair’s view. In it, he writes of recent experiments with drying and milling potatoes to create a cost-effective flour that could be stored for long sea voyages ‘without being injured by vermin.’ Pointing to the documentation of this process in The General View of the Agriculture of the County of Kent, he states that such new approaches must be ‘prosecuted with zeal, until so important an object as that of enabling this country to supply itself with food, from its own resources, is attained.’ Indeed such is the importance of these new methods, he concludes, that they are ‘entitled to the attention and support of the public’. National debate on such agricultural issues is both warranted and necessary, and in this light, it appears that the intended readership of and interest in the agricultural County Surveys is likely to have been considerably broader than we might now assume.