Downloads survey data from GESIS given a record or a dataset ID. To download
data, the session must be authenticated using gesis_auth
.
By downloading data from the GESIS data archive, you agree to its terms of use. This function is intended to be used for singular, programmatic access to survey datasets. Please refrain from using it for large-scale batch downloads.
Usage
gesis_data(
record,
download_purpose = NULL,
path = tempdir(),
type = "data",
select = NULL,
prompt = interactive()
)
Arguments
- record
Object of class
gesis_record
as returned bygesis_search
andgesis_get
or dataset ID. If a dataset ID is passed, the function performs a call togesis_get
.- download_purpose
Purpose for the use of the research data. You are required to use the downloaded data solely for the specified purpose (see also the terms of use). The following values are supported:
final_thesis
: for final thesis of the study programme (e.g. Bachelor/Master thesis)commercial_research
: for research with a commercial missionnon_scientific
: for non-scientific purposesfurther_education
: for further education and qualificationscientific_research
: for scientific research (incl. doctorate)studies
: in the course of my studieslecturer
: in a course as a lecturer
- path
Path where the downloaded file should be stored. Can be path to a directory or a file. If a directory path, it is attempted to infer the file name from the file to be downloaded. If this is not possible, the file is stored in a file called
gesis
with no file extension. If a file path is passed, the file is directly downloaded to this path. Defaults to a temporary directory path.- type
Type of data to download. A list of file types available for a given record can be retrieved using
gesis_file_types
. Common types include:data
: Research or survey dataset, usually in a Stata or SPSS data format.questionnaire
: Survey questionnaire, usually in PDF formatcodebook
: Survey codebook, usually in PDF formatreport
: Method, project or technical reports, usually in PDF format.syntax
: Code files, e.g. for replication or data cleaningother
: Other files like READMEs, method reports, or variable lists, usually in PDF format.uncategorized
: Other files that are not categorized, usually used for external links to full texts (DOI, URN)
Defaults to
"data"
because downloading PDF or HTML files rarely makes sense in R. Note that"data"
,"report"
, and"other"
are keywords that cover all types of datasets, written reports, and other documents respectively.- select
Character vector to select a data file in case multiple files are available for the selected data type. The character string is matched against the file label using regular expressions. This argument can also be used to match explicitly for file extensions, e.g.
"\\.sav"
or"\\.dat"
. Can also be of length > 1 in which case the regular expressions are matched in the order of their index positions. A list of file labels for a given record can be retrieved usinggesis_files
. IfNULL
, multiple files are detected, and the session is interactive, prompts the user to select a file. Defaults toNULL
.- prompt
If
TRUE
, allows the function to open an interactive prompt in case multiple files are found andselect
is eitherNULL
or fails to select a file unambiguously. IfFALSE
, throws an error in such a case. Defaults toTRUE
if run in an interactive session.
Value
The path to the downloaded file. Depending on the selected file, there are different ways to read the file contents. Traditionally, data files are offered in Stata and SPSS file formats and can be read using the haven package.
Details
Access and refresh tokens are automatically attached to the requests sent if possible. This is done only for URLs pointing to the domain gesis.org to avoid sending authentication information to other domains.
Examples
if (FALSE) { # \dontrun{
# retrieve a search record to pass on to gesis_data()
record <- gesis_search(
"allbus",
publication_year = c(2019, 2020),
type = "research_data"
)
# in interactive mode, the function can be run without arguments
path <- gesis_data(record[[1]])
# in other cases, certain arguments should be provided
path <- gesis_data(record[[1]], download_purpose = "non_scientific", select = "\\.sav")
# you can also simply pass a dataset ID
path <- gesis_data("ZA3753", select = "\\.por")
# data files must be read using other tools and packages, e.g. haven
haven::read_por(path)
# ... or pdftools
path <- gesis_data("ZA3753", select = "fb\\.pdf", type = "questionnaire")
pdftools::pdf_text(path)
# In some cases, a single selection regex might be difficult to work
# with, e.g., if multiple files with the same format exist.
# In this case, you may pass multiple regular expressions which are
# evaluated back by back
gesis_data(
"ZA7716",
select = c("\\.sav", "main"),
download_purpose = "non_scientific"
)
# `gesis_data` can be used to download and execute R code
record <- gesis_get("SDN-10.7802-2109")
parse(gesis_data(record, type = "syntax", select = "car_ref_all"))} # }