The DATAMP Steward FAQ
Table Of Contents
Stewardship Basics
Data Collection
Using DATAMP

What is a "Data Steward"?

"Data Steward" is the name that we give to the DATAMP volunteers who are responsible for collecting and maintaining patent data.

Stewards are responsible for collecting patent data, converting it into a format that the DATAMP server can use, and keeping the data for the patents they steward up to date in the database. Each patent detail page has a link that lets users send email to the steward for that patent, so that users can contribute additional information.

Stewards typically have a strong interest in a particular kind of patented tool, and in many cases will already have some patent data collected. For these folks, stewardship is a natural extension of their patent research. Collecting and organizing the data helps to increase their own knowledge of their area, and the fact that it is publicized for others to use is a nice side-effect.


How much work is involved?

That depends on how many patents you are willing to support. The initial data collection can be time-consuming: you need to collect and enter all of the data for the patents, and convert the data into a special text format (XML) so that DATAMP can understand it.

Once the initial data is collected, though, it is not a lot of work. You will get the occasional emailed question about one of your patents, or find additional information to add to your listings.


Do I need to be a patent expert?

You do not need to be an "expert", but you will need to know how to navigate the USPTO site in order to collect the data. Since the older patents at the USPTO are only searchable by patent number and classification, you will need a basic understanding of patent classification and how to search for patents of a particular class.

If you are fortunate enough to live near a patent library, you can also do much of your data collection using the microfilm and old patent gazettes. You will probably still need to use the USPTO web site in order to get your pictures, but most of the mundane collection can be done in the library.


How do I start?

So that stewards can get the hang of what is involved, and make sure that they are capable of and willing to do the work that is needed, we ask that a steward collect and format the data for at least 50 patents before we allow them to upload to DATAMP. Once the data is collected and reviewed, a steward account will be created for them, and they will be able to upload and edit patent data on the server.

This is not to say you are entirely on your own, however!

The DATAMP stewards use a mailing list hosted by Chris Swingley at UAF to distribute information and ask questions. This mailing list is open to the public. We also hold occasional on-line chats to work through specific issues. These chats are announced on the mailing list, and are open to anyone who is interested.

Persons interested in stewardship should join the DATAMP mailing list to ask questions and learn from other stewards.


How much information do I need to collect?

There are three categories of information collected in DATAMP: Patent Information, Tool Information, and Steward Information.

Patent information comes from the patent specification itself, and includes:

Tool information comes from markings on existing tools, old catalog information, and other tool collecting research. It includes:

Steward information is where the steward gets to add his knowledge:

In order to upload the data, at least the patent information must be provided. The other information is optional, but in most cases will be filled in. The more data we have and the more cross references there are, the more useful DATAMP is as a research tool.


What patents should I research?

A steward normally has one or two areas that he is particularly interested in, which usually coincide with his collecting interests. If your collecting interest is narrow or fairly obscure (say 'bevel gauges'), chances are there is not a steward who is responsible for these patents and you can simply fill in the "holes" in our data.

If your interest is in a broader area (say 'planes' or 'levels'), there may already be one or more stewards collecting these patents. In this case, you should see what patents are already in our database, and contact the steward(s) for these patents to see if there are some areas they would like help with.

There are thousands of patents that would be "interesting" to tool and machine collectors, so there should be no shortage of research work for interested parties. Simply find a "hole" in our data, and try to fill it!


How should I format my data?

You are free to collect and format your data in any way you want. Patents can be entered one at a time using the GUI screens or uploaded in XML files. For XML file formatting details, see the DATAMP XML specification.


Are there tools to help create the XML?

Some stewards (particularly those with an HTML background) are comfortable with editing large XML files directly. XML is a human-readable format, so this is not out of the question.

Other stewards have taken a script approach to XML creation. They collect their data in some format that is convenient to them, and use a script to convert their data to the requisite XML. In particular, ace data steward Jeff Joslin has created a Perl script to convert a specially-formatted Excel spreadsheet into DATAMP XML. Jeff has also made this spreadsheet available to others so that they can do data collection, send the file to Jeff, and have him convert it to XML. Contact us if you already have patent data in a different format. We may be able to load it into the DATAMP database.
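
As a rough illustration of the script approach, here is a minimal Python sketch that turns spreadsheet-style CSV text into XML. It is not Jeff's Perl script, and the element and column names ("patents", "patent", "number", "date", "title") are hypothetical placeholders; the real names are defined by the DATAMP XML specification. The sample row is made up.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical column/element names. The real ones come from the
# DATAMP XML specification, not from this sketch.
FIELDS = ["number", "date", "title"]


def rows_to_xml(csv_text):
    """Convert CSV text (one patent per row) into an XML string."""
    root = ET.Element("patents")
    for row in csv.DictReader(io.StringIO(csv_text)):
        patent = ET.SubElement(root, "patent")
        for field in FIELDS:
            child = ET.SubElement(patent, field)
            # Missing or empty cells become empty elements.
            child.text = (row.get(field) or "").strip()
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, for illustration only.
sample = "number,date,title\nX000001,1900-01-02,Example tool\n"
xml_text = rows_to_xml(sample)
print(xml_text)
```

Keeping the data in a spreadsheet and regenerating the XML each time means the spreadsheet stays the single place where corrections are made.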

A few stewards choose to enter patents using the GUI screens rather than uploading XML files. This is perfectly acceptable.


How do I upload my data?

Once you have collected your data and had it reviewed by one of the stewards, you will be given a DATAMP account. This allows you to log into the DATAMP server and access a set of pages for stewards only.

On the steward page there are a number of links to allow you to manage your steward account (changing passwords, etc), analyze your patent data, and edit/upload patent data. To upload your XML to the server, you simply go to the "Upload XML file" link, enter the name of your local file, and hit "Submit". This upload is referred to as the Initial Data Load or IDL for short.

The IDL process will first do a validation check, making sure that the file is both syntactically valid (i.e., no XML errors) and semantically valid (i.e., contains correct data). If the file checks out, the patent data will be added to the database.
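
If you want to catch obvious problems before uploading, a local pre-check along the same two lines (syntax first, then content) can be sketched in Python. This is not the server's actual validation: the element names and the "required fields" rule are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; consult the DATAMP XML specification
# for the real ones.
REQUIRED = ["number", "date", "title"]


def check_file(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Syntactic failure: the text is not well-formed XML.
        return [f"XML syntax error: {err}"]
    problems = []
    for index, patent in enumerate(root.iter("patent"), start=1):
        for name in REQUIRED:
            value = patent.findtext(name)
            if value is None or not value.strip():
                # Semantic failure: well-formed, but data is missing.
                problems.append(f"patent #{index}: missing <{name}>")
    return problems


good = ("<patents><patent><number>X1</number><date>1900-01-02</date>"
        "<title>Example</title></patent></patents>")
bad_syntax = "<patents><patent></patents>"
bad_data = ("<patents><patent><number>X1</number><date>1900-01-02</date>"
            "<title> </title></patent></patents>")

for label, text in [("good", good), ("bad_syntax", bad_syntax), ("bad_data", bad_data)]:
    print(label, check_file(text))
```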

Since there is a fair bit of work to insert the data into the database (generating cross-references and the like), this process can take some time. Because of this, we ask that you upload fewer than 250 patents at a time. Stewards with greater numbers of patents should split their XML files into smaller chunks.
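
Splitting a large file by hand is tedious, so here is a minimal Python sketch of one way to do it. It assumes that every direct child of the root element is one patent record, which may not match the real DATAMP XML layout; the element names in the example are hypothetical.

```python
import xml.etree.ElementTree as ET

MAX_PATENTS = 249  # stay under 250 patents per upload


def split_patents(xml_text, limit=MAX_PATENTS):
    """Split one XML document into several, each with at most `limit` patents."""
    root = ET.fromstring(xml_text)
    patents = list(root)  # assumption: each child of the root is one patent
    chunks = []
    for start in range(0, len(patents), limit):
        # Each chunk gets its own root with the same tag and attributes.
        part = ET.Element(root.tag, dict(root.attrib))
        part.extend(patents[start:start + limit])
        chunks.append(ET.tostring(part, encoding="unicode"))
    return chunks


# Made-up input with 600 placeholder patents.
big = "<patents>" + "".join(
    f"<patent><number>X{i}</number></patent>" for i in range(600)
) + "</patents>"
chunks = split_patents(big)
print([len(ET.fromstring(chunk)) for chunk in chunks])
```

Each returned string can be saved to its own file and uploaded separately.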


How do I add/edit information?

Once the IDL has been done and the data is in the database, you have two options for maintaining or updating it. You can either make changes to the XML and resubmit it (good if there are lots of changes), or you can use the editing features of the DATAMP UI (better for minor tweaks).

If you are logged in as the steward, the patent display will be slightly different for patents that you are the steward for. In these cases, the link that allows the user to send mail to the steward will change to say "Edit this patent's information". Clicking on this link will take you to an editing page that allows you to modify the basic patent information (number, type, description, etc).

In cases where there may be multiple pieces of information (like for person names and pictures), there will be "[add]" and "[edit]" links above the information to allow you to add, delete, or change information.

For example, if you find out that a particular company manufactured a tool covered by one of your patents, you would log on as steward, display the patent in question, and click the "[add]" link above the manufacturer table. This takes you to a dialog that allows you to enter the new information. When you submit this information, it is automatically added to the database.


What should I do for pictures?

Images and descriptions are the things that make DATAMP stand out. Small, clean, easy-to-browse images, combined with useful descriptions and extensive searching capabilities, are what make this a valuable resource for tool collectors.

What that means is that we need to be very careful to provide good images, and display them consistently. Please refer to the DATAMP Image Guidelines document for details on the limitations of image display in DATAMP, and guidelines for images and how they are used.


Is there a way to check my data?

To assist our stewards, we have created a series of special "reports" that analyze a steward's data for various error conditions. We check for things like patents not issued on a Tuesday, missing titles and descriptions, duplicate classifications, and so on.
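
The same kinds of checks can be run on your own data before you upload it. The Python sketch below is not the code behind the Steward Reports; it only illustrates the three checks named above. The record layout (a plain dict with "date", "title", "description", and "classes" keys) and the sample values are hypothetical.

```python
from collections import Counter
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def report(patent):
    """Return warnings for one patent record (a plain dict in this sketch)."""
    warnings = []
    issued = date.fromisoformat(patent["date"])
    if issued.weekday() != 1:  # Monday is 0, so Tuesday is 1
        warnings.append(f"issued on a {DAYS[issued.weekday()]}, not a Tuesday")
    for field in ("title", "description"):
        if not patent.get(field, "").strip():
            warnings.append(f"missing {field}")
    counts = Counter(patent.get("classes", []))
    for cls, n in sorted(counts.items()):
        if n > 1:
            warnings.append(f"duplicate classification {cls}")
    return warnings


# Made-up record, for illustration only.
record = {
    "date": "1900-01-03",
    "title": "Example tool",
    "description": "",
    "classes": ["30/1", "30/1", "33/2"],
}
for line in report(record):
    print(line)
```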

The Steward Reports are available off the main steward page.

The available reports are constantly growing as new needs are discovered. If you have an idea for a new report, bring it up on the mailing list and we'll see what we can do about it.