
The Ruby study pioneers the use of online US voter rolls

This post highlights how Peter Bradish developed tools based on US voter lists to provide new datasets for American Rubys. For those unfamiliar with US voter lists (like me), the end of the post offers an excellent FAQ to enlighten you, including a technical explanation. Thanks to Peter for sharing his innovative approach and to Paul Howes for his input.


A year ago I responded to the plea for Ruby Project volunteers and took on the nine Rubys
listed in the 1900 US Census in Florida. Since then I’ve learned and discovered a great deal. After
taking the Pharos “Introduction to One-Name Studies” course last winter I understood the great potential of bulk data gathering for an ONS newbie such as myself. 
I first applied the process to my own ONS (e.g. using FreeBMD, the database of births,
marriages and deaths in England and Wales) and realized its
potential for the Ruby Project when it was suggested that I gather the Ruby information from
the Florida Registered Voters online database. With nearly 400 Rubys, the cutting and pasting
or re-keying of names, gender, date of birth and addresses looked like a month’s worth of
boredom and stress. So I applied what I’d learned from gathering and recording FreeBMD data.
Basically this meant downloading the Florida voter database (67 county files totaling 2GB),
writing and using a simple program to select only the Ruby records (60KB), employing a
spreadsheet to select, sort and clean relevant data, and writing a program to convert the data exported
from the spreadsheet to a GEDCOM file for sending to the project team leader to merge into the 
Ruby Project master file.
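The selection step can be sketched in a few lines of Python. This is only an illustration, not Peter's actual program: the tab delimiter, the position of the surname column and the sample rows are all assumptions made up for the example, and each state's real layout has to be checked first.

```python
import csv
from typing import Iterable, Iterator, List


def select_surname(lines: Iterable[str], surname: str,
                   surname_col: int = 0, delimiter: str = "\t") -> Iterator[List[str]]:
    """Yield only the rows whose surname column equals `surname` (case-insensitive).

    `surname_col` and `delimiter` are assumptions; real voter files differ by state.
    """
    wanted = surname.strip().upper()
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) > surname_col and row[surname_col].strip().upper() == wanted:
            yield row


# Invented sample rows: surname, first name, middle name, date of birth.
sample = [
    "RUBY\tJANE\tANN\t03/04/1955",
    "SMITH\tANNA\tLEE\t01/02/1980",
    "Ruby\tMARK\t\t05/06/1990",
    "RUBYMAN\tTOM\t\t07/08/1962",
]
ruby_rows = list(select_surname(sample, "Ruby"))
```

Because the function accepts any iterable of lines, an open file can be passed in directly, so a large county file is read one line at a time rather than loaded into memory. Matching the whole field keeps out names such as Rubyman that merely begin with Ruby.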
It took a long time to accomplish the Florida Ruby voter data extraction and conversion. However, having learned how to do it for one state, I found the other states with online voter rolls went a lot quicker. Now it takes only a couple of hours to do a single state, plus the time to determine how the data has been recorded and to write a source citation.
[Image: Florida population map, 2010]

EXAMPLE OF VOTER LIST INFORMATION
Ruby, John James was born 17 September 1974, is male, registered as No Party Affiliation, residing at 700 Villa San Marco Dr, #444, St Augustine, Florida 32086-4142. Florida voter ID number 123456789. The voter lists a mailing address and probably prefers you use it: 11111 S 11Th East Ave Bixby OK 74008-8227. This is the most recent information, from the Florida voter list as of 30 November 2018.
Previous information:
31 May 2013 voter list: John James Ruby, 700 Villa San Marco DR, #444, St Augustine, FL 32086 No Party Affiliation.



There are nine states with easily accessible voter databases: Arkansas, Colorado, Connecticut,
Delaware, Florida, Michigan, Ohio, Oklahoma and Rhode Island. Each has an online capability
allowing the user to drill down by surname and given name to view individual records. A total of 2015 records have been extracted from those nine states for the Ruby Project. Given that the nine states make up about 20% of the US population, we estimate that there may be as many as 10,000 Ruby people in the US!
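The arithmetic behind that estimate is a simple scaling, shown here as a rough sketch. It assumes Rubys are spread across the states in proportion to population, which may well not hold, and it counts only registered voters.

```python
records_found = 2015      # Ruby voter records extracted from the nine states (figure from the post)
population_share = 0.20   # approximate share of the US population in those states (figure from the post)

# Scale the nine-state count up to the whole country.
estimated_us_rubys = records_found / population_share
print(round(estimated_us_rubys))  # 10075
```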
Just extracting the data is not the end of the process, though. To make the data meaningful, one has to group families living at the same address, a process that cannot be mechanized. For two states (Ohio and Oklahoma), one has to guess the sex of the individual too, which is quite time-consuming and not as simple as it seems given modern naming patterns in America!
The resulting data has led to some insights. First, we have a ready supply of modern information to help with family reconstruction and to add to future obituaries as we spot them. Second, there is a very small proportion of people (well below 1%) who are registered to vote in more than one state! Some of that is no doubt due to people moving around and to timing differences between the online databases: Florida’s data is practically up to date, whereas Delaware’s dates from 2015.
Lastly, it’s worth noting that the Florida population has grown from just over half a million people in 1900 to over 21 million today, a roughly 40-fold increase. Meanwhile, the population of Rubys has gone from 9 to over 400, and that does not even include people below voting age! Something about Florida is particularly attractive to people named Ruby!
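The multi-state registrations mentioned among the insights above can be flagged with a simple grouping once the extracts from every state sit in one table. The sketch below is only one possible approach, and the record layout and sample people are invented for illustration. It matches on name and date of birth, so any hit is a candidate to check by hand, not a confirmed duplicate.

```python
from collections import defaultdict


def multi_state_candidates(records):
    """Return people (given name, surname, date of birth) who appear in more than one state.

    `records` is an iterable of (given_name, surname, dob, state) tuples,
    a hypothetical layout chosen for this example.
    """
    states_by_person = defaultdict(set)
    for given, surname, dob, state in records:
        key = (given.strip().upper(), surname.strip().upper(), dob)
        states_by_person[key].add(state)
    return {person: sorted(states)
            for person, states in states_by_person.items()
            if len(states) > 1}


# Invented sample data.
sample_records = [
    ("Jane Ann", "Ruby", "1955-03-04", "FL"),
    ("JANE ANN ", "ruby", "1955-03-04", "OK"),
    ("Mark", "Ruby", "1990-05-06", "OH"),
]
candidates = multi_state_candidates(sample_records)
```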

The FAQs
Are all the individuals in the voter lists living? If so, how does one connect them to a line of Rubys going backward?
All the people are "supposed" to be living, but some don't get removed from their state's voter list when they pass away or move out of state. We spotted several duplicate people who have moved between states and one or two who have died since the lists were prepared. The state has to be notified of the person's death or removal. We suspect some states have a period of inactivity (i.e., the person not voting) that triggers a requirement to re-register, but we haven't checked each state's requirements. That people are on the list is sufficient evidence of their existence, however. We can only connect them going backward if we have the records to do so, such as a documented birth date, marriage or name(s) that match. We were able to spot about 100 people who matched records we already had and merged them in as we added all the data to our master file.

Do the households identify the relationships between individuals living in the same household? 
No. We have made our best guess about relationships based on the ages and sexes of people living at the same address. On the other hand, since everyone is alive, we are following good genealogical practice and are not publishing the data ourselves, so it's not crucial whether we have got the relationships correct. We have plenty of time to sort out any errors.

What is the span of years of a voter list, for example, the frequency of updating for people deceased, or no longer in the state?
The situation varies by state. Some lists are updated daily (on working days) and some states provide new lists annually or less frequently. All the voter registration records are public information and are not subject to the privacy laws you might expect.

How did you move these lists from data to genealogical software or directly to a website database? 
I wrote a program to extract specific surname records (e.g., Ruby) from the massive complete state databases and write them to a CSV (comma- or tab-separated values) file, which can be imported into a spreadsheet. Once the data is imported, I clean out the non-genealogical data (e.g., registration date and past voting history, meaning whether they voted and whether in person or absentee). Then words and names in ALL CAPS are converted to an initial capital only. Next, the address fields are concatenated into one residence place string, and first and middle names are concatenated into a given name. Birth dates are all converted from the US date format, MM/DD/YYYY, into the European format, DD/MM/YYYY, used by most genealogists. All columns are then arranged into a common order, which I use for the program I wrote to convert the data into a GEDCOM file with name, DoB and residence tags and a source citation for the voter registration database. I also typically add a column with a "sortable" DoB, which can be used to sort the data "properly" in actual date order, as the dates are normally provided only as text, which would sort by month (US) or day (European). As you can appreciate, this is the technical part, which won't be simple for most genealogists. This is the reason we think the Ruby study has pioneered the use of such information.
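For readers who would rather script these steps than do them in a spreadsheet, here is a minimal Python sketch of the cleaning and GEDCOM conversion described above. It is not the program used for the Ruby Project. The function names, field layout, sample record and the source pointer @S1@ are all assumptions made up for the example. The simple title-casing is also naive: it would turn "MCDONALD" into "Mcdonald" and "FL" into "Fl".

```python
from datetime import datetime

# GEDCOM dates use English three-letter month abbreviations.
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def clean_record(surname, first, middle, dob_us, address_parts):
    """Clean one voter record; the argument layout is hypothetical."""
    birth = datetime.strptime(dob_us, "%m/%d/%Y")  # US format MM/DD/YYYY
    given = " ".join(p.strip() for p in (first, middle) if p.strip())
    residence = ", ".join(p.strip() for p in address_parts if p.strip())
    return {
        "surname": surname.strip().title(),         # ALL CAPS to initial capital
        "given": given.title(),
        "residence": residence.title(),             # one residence place string
        "dob": birth.strftime("%d/%m/%Y"),          # European format DD/MM/YYYY
        "dob_sortable": birth.strftime("%Y-%m-%d"), # text that sorts in date order
        "dob_gedcom": f"{birth.day} {MONTHS[birth.month - 1]} {birth.year}",
    }


def to_gedcom_individual(rec, number, source_id="@S1@"):
    """Return the GEDCOM lines for one person with name, birth, residence and source."""
    return [
        f"0 @I{number}@ INDI",
        f"1 NAME {rec['given']} /{rec['surname']}/",
        "1 BIRT",
        f"2 DATE {rec['dob_gedcom']}",
        "1 RESI",
        f"2 PLAC {rec['residence']}",
        f"1 SOUR {source_id}",
    ]


# Invented sample record.
rec = clean_record("RUBY", "JANE", "ANN", "03/04/1955",
                   ["1 MAIN ST", "ANYTOWN", "FL 00000"])
gedcom_lines = to_gedcom_individual(rec, 1)
```

A complete GEDCOM file also needs a header record, a source record for the voter registration database that @S1@ points to, and a trailer line, all omitted here for brevity. The sample date shows why the conversion matters: 03/04/1955 is 4 March in the US format but would read as 3 April to a European eye.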

OK, I accept this is new and appreciate that you will not be publishing the data. So why bother with it at all?
Good question! Yes, we will NOT be publishing this information online until we know a particular individual has died. However, we have already found it helpful. Let's say we have a daily feed from companies providing obituaries, like Tributes.com or Legacy.com. Many American obituaries contain lots of information on descendants and other relatives who are still alive. If those folks live in one of the nine states, we can immediately connect them up with accurate relationships, birth dates and so on. The same goes for news items: having a large number of living Rubys allows us to identify many people who appear in the news and makes our database a whole lot more interesting! To summarize, look at it this way: with the voter lists in our file we have tomorrow's census releases today!











