Welcome to the Data and Story Library, an archive of hundreds of datafiles for use by students and teachers of statistics and data science. We host data on a wide variety of topics to provide real-world examples. We recognize that data are not just numbers; data require a context. DASL provides background information about the data and a source reference whenever that information is available. That’s the “Story” part of our name.
DASL is brought to you by Data Description, Inc., creators of Data Desk.
Click on the “Search by Statistical Method” box and choose a method from the dropdown menu. DASL applies all search criteria together, so if you want every dataset suitable for a method, be sure to leave the “Search by text” field empty. Alternatively, you can specify both a statistical method and a text search to narrow the results. You can also select more than one method to search for datasets suitable for several methods.
Type some identifying text into the “Search by text” box. DASL searches while you type, so you can see whether you are on target. DASL searches all available descriptive text, including the datafile name and description.
You may want to exclude larger datasets for pedagogical reasons or because you are working in an environment with restricted capability. Just indicate the range of data sizes in the Filter. Of course, this constraint works along with all other search criteria.
The easiest way to transfer DASL data to R is to use the Data Desk program (available here: www.datadesk.com). Use the “Open in Data Desk” option for any datafile. Once in Data Desk, choose “Batch Export” from the R menu and drag the icons of the variables you wish to move into the window provided. (You don’t need to move all the variables in the dataset.) Click the Save Data button to save the data in text format to the location you choose. Then click the Save R Script option to save an R script that reads the data and on which you can build your own program. Note that if you first perform some analyses or make plots in Data Desk, you can export those to R as well, creating R programs that will reproduce them. Data Desk is thus a fully graphical interface for basic R functions.
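For readers who want a sense of what building on the exported file might look like, here is a minimal sketch in R. It is not the script that Data Desk generates. It assumes the saved text file is tab-delimited with variable names in the first row, and the variable names and values are made-up placeholders written to a temporary file so the example runs on its own.

```r
# Stand-in for a file saved with the Save Data button.
# Assumption: tab-delimited text with variable names in the first row.
# The variables (Height, Weight) and their values are hypothetical placeholders.
data_path <- tempfile(fileext = ".txt")
writeLines(c("Height\tWeight", "150\t52", "165\t61", "180\t77"), data_path)

# Read the text file into a data frame.
dasl_data <- read.delim(data_path, header = TRUE, stringsAsFactors = FALSE)

# Check the variable names and types before going further.
str(dasl_data)
summary(dasl_data)

# A first analysis to build on: a simple linear regression.
fit <- lm(Weight ~ Height, data = dasl_data)
print(coef(fit))
```

With a real export, replace `data_path` with the location you chose when saving, and check the file in a text editor first if you are unsure which delimiter it uses.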
DASL was developed by Paul Velleman. Datasets from textbooks he co-authored can generally be found by searching for the exercise title used in the text, for those exercises that indicate a datafile is available.
You will see some datafiles listed more than once, with years appended to their names. When a new version of the data, or new data extending the original file, becomes available, DASL adds it as a new datafile with the date appended. We do this so that teachers and books using older versions of the data will continue to have those versions available.
Data Desk was developed by a Cornell University statistics professor. It is easy to learn and to use. Students point to what they want to do and drag variables onto plots and tables to specify the data to use, so you spend less class time on code.
Looking for help with your statistics class? You’ve come to the right place. Data Desk is entirely point-and-click and graphical. You won’t need to learn a new language or worry about punctuation and syntax.