spod_quick_get_zones() is a new function to quickly get municipality geometries that match the data retrieved with spod_quick_get_od() (PR #163). This function is experimental, just like spod_quick_get_od(), because the API of the Spanish Ministry of Transport may change in the future. It is intended only for quick analysis for educational or other demonstration purposes, as it downloads very little data compared to the regular spod_get_od(), spod_download() and spod_convert() functions. Requests are cached in the memory of the current R session with the memoise package, so repeated calls to spod_quick_get_zones() return faster and do not cause repeated requests to the API.
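The in-session caching described above can be illustrated with a minimal sketch. It assumes the memoise package is installed; fetch_zones() and the n_requests counter are hypothetical stand-ins for an API request and are not part of spanishoddata.

```r
library(memoise)

# Hypothetical stand-in for a function that sends a request to a remote API.
# The counter records how many times the underlying function really runs.
n_requests <- 0
fetch_zones <- function() {
  n_requests <<- n_requests + 1
  data.frame(id = c("a", "b"), name = c("Zone A", "Zone B"))
}

# memoise() wraps the function with a cache that, by default, lives in the
# memory of the current R session.
cached_fetch_zones <- memoise::memoise(fetch_zones)

first <- cached_fetch_zones()   # runs fetch_zones() and stores the result
second <- cached_fetch_zones()  # served from the cache, no second "request"

identical(first, second)
n_requests
```

The cache is dropped when the R session ends, so the first call in a new session contacts the API again.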
The experimental spod_check_files() function checks the consistency of downloaded files against Amazon S3 checksums (PR #165). ETags for v1 data are stored with the package; for v2 data they are fetched from Amazon S3. The checks may fail for May 2022 data and for some 2025 data, because the remote checksums used for checking file consistency are incorrect. We are working on solving this in future updates. For now, please rely on the built-in file size checks of spod_download(), spod_get(), and spod_convert().
spod_get() and spod_convert() are now up to 100,000 times faster when you have all (or a lot of) the data downloaded but request only a few days in the call to spod_get() or spod_convert(). This is thanks to a new, smarter filtering strategy (issue #159, PR #166).
Metadata is now fetched from the Amazon S3 storage of the original data files, which allows downloaded files to be validated by both size and checksum (issue #126, PR #165).
Metadata fetched by spod_available_data() has extra columns such as data type, zones and period; see help ?spod_available_data() for details.
Memory allocation is now delegated to the DuckDB engine, which by default uses 80% of available RAM. Beware that in some HPC environments DuckDB may detect more memory than is actually available to your job, so set the limit manually to 80% of the RAM available to your job with the max_mem_gb argument of the spod_get(), spod_convert() and spod_connect() functions. Delegating memory allocation will also improve performance in some cases, as DuckDB is more efficient than R at memory allocation (PR #167).
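As a rough sketch of setting the limit manually, the snippet below computes 80% of the RAM allocated to a job and passes it as max_mem_gb. The helper job_mem_limit_gb(), the wrapper get_od_with_limit(), the 32 GB allocation and the example dates are all assumptions made for illustration; only spod_get() and its max_mem_gb argument come from the package.

```r
# Hypothetical helper (not part of spanishoddata): memory limit in GB for a
# job, taken as a fraction of the RAM the scheduler allocated to that job.
job_mem_limit_gb <- function(job_ram_gb, fraction = 0.8) {
  stopifnot(is.numeric(job_ram_gb), job_ram_gb > 0)
  max(1, floor(job_ram_gb * fraction))
}

# Assumption: the job was allocated 32 GB of RAM.
mem_limit <- job_mem_limit_gb(32)

# Wrapped in a function so that nothing is downloaded until it is called.
get_od_with_limit <- function(dates, max_mem_gb) {
  spanishoddata::spod_get(
    type = "od",
    zones = "distr",
    dates = dates,
    max_mem_gb = max_mem_gb
  )
}

# Example call (requires the package and a configured data directory):
# od <- get_od_with_limit(c("2022-01-01", "2022-01-02"), mem_limit)
```

The same max_mem_gb value can be passed to spod_convert() and spod_connect().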
Data file downloads are now more reliable but still multi-threaded. They use base R utils::download.file() instead of curl::multi_download(), which failed on some connections (issue #127), so the curl dependency is no longer required (PR #165).
spod_quick_get_od() is working again. We fixed it to work with the updated API of the Spanish Ministry of Transport (PR #163, issue #162). This function will remain experimental, just like spod_quick_get_zones(), because the API of the Spanish Ministry of Transport may change in the future. It is intended only for quick analysis for educational or other demonstration purposes, as it downloads very little data compared to the regular spod_get_od(), spod_download() and spod_convert() functions. Requests are cached in the memory of the current R session with the memoise package, so repeated calls to spod_quick_get_od() return faster and do not cause repeated requests to the API.
spod_convert() now accepts overwrite = 'update' with save_format = 'parquet' (#161). Previously it failed because of an incorrect check that accepted only TRUE or FALSE (#160).
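The kind of argument check involved can be sketched as follows. validate_overwrite() is a hypothetical function written for illustration and is not the package's actual code; it only shows a check that accepts TRUE, FALSE or 'update' instead of logical values alone.

```r
# Hypothetical validator (not the package's implementation): accepts a single
# TRUE or FALSE, or the string "update"; anything else raises an error.
validate_overwrite <- function(overwrite) {
  is_flag <- is.logical(overwrite) && length(overwrite) == 1 && !is.na(overwrite)
  is_update <- identical(overwrite, "update")
  if (!(is_flag || is_update)) {
    stop("`overwrite` must be TRUE, FALSE, or 'update'")
  }
  invisible(overwrite)
}

validate_overwrite(TRUE)
validate_overwrite("update")

# Example call to the package function (requires a configured data directory;
# the dates are illustrative):
# spanishoddata::spod_convert(
#   type = "od", zones = "distr",
#   dates = c("2022-01-01", "2022-01-02"),
#   save_format = "parquet", overwrite = "update"
# )
```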
spod_cite() function to easily cite the package and the data (#134)
The time_slot column is superseded by the hour column in the output of spod_get() and spod_convert(). time_slot is deprecated. It is still present in the tables but will be removed at the end of 2025, so going forward please use the new hour column. Otherwise it is exactly the same as before; this is just a name change (#132).
spod_quick_get() no longer relies on the metadata download and can be used without setting the data directory with spod_set_data_dir() (and therefore does not cause a warning if the data directory is not set).
The hour (ex-time_slot) column is now placed right next to the date column in the output of spod_get() and spod_convert() (#)
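Because hour and time_slot differ only in name, scripts written against the old column can be adapted with a small rename step. The sketch below uses a toy data frame in place of real package output; ensure_hour(), the n_trips column and its values are illustrative assumptions, and real spod_get() output is a database-backed table rather than a plain data frame.

```r
# Toy data frame standing in for older output that only has `time_slot`.
old_od <- data.frame(
  date = as.Date("2022-01-01"),
  time_slot = 0:2,
  n_trips = c(10.5, 3.2, 7.1)
)

# Hypothetical helper: rename `time_slot` to `hour` when `hour` is missing,
# and leave the table untouched otherwise.
ensure_hour <- function(df) {
  if (!"hour" %in% names(df) && "time_slot" %in% names(df)) {
    names(df)[names(df) == "time_slot"] <- "hour"
  }
  df
}

new_od <- ensure_hour(old_od)
names(new_od)
```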
The check for the maximum number of available CPU cores is now turned off to improve compatibility when running the package from within a container in high-performance computing environments (see #130 and #140 for details)
minor documentation improvements and updates
minor bug fixes