Larger-than-RAM Disk-Based Data Manipulation Framework


Documentation for package ‘disk.frame’ version 0.3.7

Help Pages

-- A --

add_chunk Add a chunk to the disk.frame
add_count.disk.frame The dplyr verbs implemented for disk.frame
add_tally.disk.frame The dplyr verbs implemented for disk.frame
all_df.chunk_agg.disk.frame One Stage function
all_df.collected_agg.disk.frame One Stage function
anti_join.disk.frame Performs join/merge for disk.frames
any_df.chunk_agg.disk.frame One Stage function
any_df.collected_agg.disk.frame One Stage function
arrange.disk.frame The dplyr verbs implemented for disk.frame
as.data.frame.disk.frame Convert disk.frame to data.frame by collecting all chunks
as.data.table.disk.frame Convert disk.frame to data.table by collecting all chunks
as.disk.frame Make a data.frame into a disk.frame

-- C --

ceremony_text Show the code to set up disk.frame
chunk_arrange The dplyr verbs implemented for disk.frame
chunk_distinct The dplyr verbs implemented for disk.frame
chunk_group_by Group by within each disk.frame
chunk_lapply Apply the same function to all chunks
chunk_summarise Group by within each disk.frame
chunk_summarize Group by within each disk.frame
chunk_ungroup Group by within each disk.frame
cimap Apply the same function to all chunks
cimap.disk.frame Apply the same function to all chunks
cimap_dfr Apply the same function to all chunks
cimap_dfr.disk.frame Apply the same function to all chunks
cmap Apply the same function to all chunks
cmap.disk.frame Apply the same function to all chunks
cmap2 Apply a function to two disk.frames
cmap_dfr Apply the same function to all chunks
cmap_dfr.disk.frame Apply the same function to all chunks
collect.disk.frame Bring the disk.frame into R
collect.summarized_disk.frame Bring the disk.frame into R
collect_list Bring the disk.frame into R
colnames Return the column names of the disk.frame
colnames.default Return the column names of the disk.frame
colnames.disk.frame Return the column names of the disk.frame
compute.disk.frame Compute without writing
copy_df_to Move or copy a disk.frame to another location
count.disk.frame The dplyr verbs implemented for disk.frame
create_chunk_mapper Create function that applies to each chunk of disk.frame
create_dplyr_mapper Kept for backwards-compatibility; to be removed in 0.3
csv_to_disk.frame Convert CSV file(s) to disk.frame format

-- D --

delayed Apply the same function to all chunks
delete Delete a disk.frame
dfglm Fit generalized linear models (glm) with disk.frame
df_ram_size Get the size of RAM in gigabytes
disk.frame Create a disk.frame from a folder
distinct.disk.frame The dplyr verbs implemented for disk.frame
distribute Shard a data.frame/data.table or disk.frame into chunks and save it into a disk.frame
do.disk.frame The dplyr verbs implemented for disk.frame

-- E --

evalparseglue Helper function to parse and evaluate a 'glue::glue' string

-- F --

filter.disk.frame The dplyr verbs implemented for disk.frame
foverlaps.disk.frame Apply data.table's foverlaps to the disk.frame
full_join.disk.frame Performs join/merge for disk.frames

-- G --

gen_datatable_synthetic Generate synthetic dataset for testing
get_chunk Obtain one chunk by chunk id
get_chunk.disk.frame Obtain one chunk by chunk id
get_chunk_ids Get the chunk IDs and file names
glimpse.disk.frame The dplyr verbs implemented for disk.frame
groups.disk.frame The shard keys of the disk.frame
group_by.disk.frame A function to parse the summarize function
group_vars.disk.frame Column names for RStudio auto-complete

-- H --

hard_arrange Perform a hard arrange
hard_arrange.data.frame Perform a hard arrange
hard_arrange.disk.frame Perform a hard arrange
hard_group_by Perform a hard group
hard_group_by.data.frame Perform a hard group
hard_group_by.disk.frame Perform a hard group
head.disk.frame Head and tail of the disk.frame

-- I --

imap Apply the same function to all chunks
imap.default Apply the same function to all chunks
imap_dfr Apply the same function to all chunks
imap_dfr.default Apply the same function to all chunks
imap_dfr.disk.frame Apply the same function to all chunks
inner_join.disk.frame Performs join/merge for disk.frames
insert_ceremony Show the code to set up disk.frame
IQR_df.chunk_agg.disk.frame One Stage function
IQR_df.collected_agg.disk.frame One Stage function
is_disk.frame Checks if a folder is a disk.frame

-- L --

lazy Apply the same function to all chunks
lazy.disk.frame Apply the same function to all chunks
left_join.disk.frame Performs join/merge for disk.frames
length_df.chunk_agg.disk.frame One Stage function
length_df.collected_agg.disk.frame One Stage function

-- M --

make_glm_streaming_fn A streaming function for speedglm
map Apply the same function to all chunks
map.default Apply the same function to all chunks
map.disk.frame Apply the same function to all chunks
map2 Apply a function to two disk.frames
map_by_chunk_id Apply a function to two disk.frames
map_dfr.default Apply the same function to all chunks
map_dfr.disk.frame Apply the same function to all chunks
max_df.chunk_agg.disk.frame One Stage function
max_df.collected_agg.disk.frame One Stage function
mean_df.chunk_agg.disk.frame One Stage function
mean_df.collected_agg.disk.frame One Stage function
median_df.chunk_agg.disk.frame One Stage function
median_df.collected_agg.disk.frame One Stage function
merge.disk.frame Merge function for disk.frames
min_df.chunk_agg.disk.frame One Stage function
min_df.collected_agg.disk.frame One Stage function
move_to Move or copy a disk.frame to another location
mutate.disk.frame The dplyr verbs implemented for disk.frame

-- N --

names.disk.frame Return the column names of the disk.frame
nchunk Returns the number of chunks in a disk.frame
nchunk.disk.frame Returns the number of chunks in a disk.frame
nchunks Returns the number of chunks in a disk.frame
nchunks.disk.frame Returns the number of chunks in a disk.frame
ncol Number of rows or columns
ncol.disk.frame Number of rows or columns
nrow Number of rows or columns
nrow.disk.frame Number of rows or columns
n_df.chunk_agg.disk.frame One Stage function
n_df.collected_agg.disk.frame One Stage function
n_distinct_df.chunk_agg.disk.frame One Stage function
n_distinct_df.collected_agg.disk.frame One Stage function

-- O --

output_disk.frame Write disk.frame to disk
overwrite_check Check if the outdir exists or not

-- P --

print.disk.frame Print disk.frame
pull.disk.frame Pull a column from a table, similar to 'dplyr::pull'

-- Q --

quantile_df.chunk_agg.disk.frame One Stage function
quantile_df.collected_agg.disk.frame One Stage function

-- R --

rbindlist.disk.frame rbindlist disk.frames together
rechunk Increase or decrease the number of chunks in the disk.frame
recommend_nchunks Recommend number of chunks based on input size
remove_chunk Removes a chunk from the disk.frame
rename.disk.frame The dplyr verbs implemented for disk.frame

-- S --

sample_frac.disk.frame Sample n rows from a disk.frame
sd_df.chunk_agg.disk.frame One Stage function
sd_df.collected_agg.disk.frame One Stage function
select.disk.frame The dplyr verbs implemented for disk.frame
semi_join.disk.frame Performs join/merge for disk.frames
setup_disk.frame Set up disk.frame environment
shard Shard a data.frame/data.table or disk.frame into chunks and save it into a disk.frame
shardkey Returns the shardkey (not implemented yet)
shardkey_equal Compare two disk.frame shardkeys
show_boilerplate Show the code to set up disk.frame
show_ceremony Show the code to set up disk.frame
srckeep Keep only the variables from the input listed in selections
srckeepchunks Keep only the variables from the input listed in selections
summarise.disk.frame A function to parse the summarize function
summarise.grouped_disk.frame A function to parse the summarize function
summarize.disk.frame A function to parse the summarize function
summarize.grouped_disk.frame A function to parse the summarize function
sum_df.chunk_agg.disk.frame One Stage function
sum_df.collected_agg.disk.frame One Stage function

-- T --

tail.disk.frame Head and tail of the disk.frame
tally.disk.frame The dplyr verbs implemented for disk.frame
tbl_vars.disk.frame Column names for RStudio auto-complete
transmute.disk.frame The dplyr verbs implemented for disk.frame

-- V --

var_df.chunk_agg.disk.frame One Stage function
var_df.collected_agg.disk.frame One Stage function

-- W --

write_disk.frame Write disk.frame to disk

-- Z --

zip_to_disk.frame Read and convert every CSV file within a zip file to disk.frame format

-- misc --

[.disk.frame '[' interface for disk.frame using the fst backend