DataExplorer 0.8.1
Enhancements
- #111: Continuous distributions can now be plotted with different scales, i.e., histogram, density, boxplot, scatterplot.
- #126: Cleaned up labels in legend guide.
- #127 (PR): Added option to plot columns with missing values only in
plot_missing
.
- Cleaned up code for
create_report
.
Bug Fixes
- #109: Fixed a bug causing unordered bar charts.
- #114: Removed redundant message in
dummify
.
- #116: Fixed pandoc document conversion error 99.
- #120: Fixed type
logical
being parsed as symbol
in configure_report
.
- #121: Fixed missing value bug when
split_columns(..., binary_as_factor = TRUE)
.
- #130 (PR):
plot_prcomp
now drops columns with zero variance.
DataExplorer 0.8.0
New Features
- #92: Added
update_columns
to transform any selected columns.
Enhancements
- #87: Added
configure_report
function to customize report content.
- #89: Added option to customize
geom_text
and geom_label
arguments.
- #91:
create_report
now displays full report directory after completion.
- #95: Added better exception handling for
plot_bar
.
- #98: Added band customization to
plot_missing
.
- #100: Switched
geom_text
to geom_label
.
- #103: Report title can now be customized in
create_report
.
- #108: Added option to treat binary features as discrete in
plot_bar
, plot_histogram
, plot_density
and plot_boxplot
.
- Updated d3.min.js to v5.9.2.
Bug Fixes
- #88: Added
plot_intro
to report config.
- #90: Added first plot in
plot_prcomp
to output and page_0
.
- #94: Fixed typo for PCA.
DataExplorer 0.7.1
Enhancements
- #86: Replaced
gridExtra::grid.arrange
with facets.
- Added seeds to vignette and README for re-producible examples.
- Hid all internal functions.
DataExplorer 0.7.0
New Features
- #72: Added
plot_qq
for QQ plot.
- #76: Added
plot_intro
to visualize results of introduce
.
Enhancements
- #42: Applied S3 methods for plotting functions.
- #77:
dummify
now works on selected columns.
- #78: All ggplot objects from
plot_*
are now invisibly returned. As a result, extracted profile_missing
from plot_missing
for missing value profiles.
- #83: Removed all deprecated functions.
- #85: Users can now specify number of rows/columns for plot page layout.
plot_prcomp
now passed scale. = TRUE
to prcomp
by default.
- Added
sampled_rows
argument to plot_scatterplot
.
- Added option to parallelize plot object construction.
- Updated default config for
create_report
.
Bug Fixes
- #74: Fixed a bug causing
create_report
failure due to zero complete rows.
- #75: Fixed a bug in
plot_str
when plotting data.frame with more than 100 columns.
- #82: Removed hard-coded scales from all plot functions.
- Fixed a bug causing wrong column indices in
split_columns
.
- Fixed a bug using standard deviation instead of variance in
plot_prcomp
.
DataExplorer 0.6.1
Enhancements
- Updated vignette for better clarity.
- #71: Added better error handler for
plot_prcomp
.
Bug Fixes
- #69: Fixed bug causing
create_report
failure (specifically from plot_prcomp
) when y
is specified.
- Added more unit tests for
create_report
and plot_prcomp
.
DataExplorer 0.6.0
New Features
- #15: Added
plot_prcomp
to visualize principal component analysis.
- #54: Extracted
dummify
from plot_correlation
as a new function.
- #59: Added
introduce
for basic metadata.
Enhancements
- #41:
create_report
can now be customized.
- #53: Added page number for plots that span multiple pages.
- #56: Added support for theme and customization for individual components.
- #62:
plot_bar
now supports optional measures (in addition to categorical frequency) using argument with
.
- #66: Feature engineering functions works on other classes in addition to just data.table.
plot_missing
:
- Percentage text labels from output plot now has 2 decimals to prevent small percentages from being truncated to 0%.
- Added example to quickly drop columns with too many missing values.
- Added
.ignoreCat
and .getAllMissing
to helper.
Bug Fixes
- #55: Fixed bugs and updated vignette with latest functions.
- #57: Fixed
plot_str
bug for not supporting S4 objects.
- #63: Fixed
plot_histogram
and plot_density
not working with column names containing spaces.
DataExplorer 0.5.0
New Features
- #48: Added
plot_scatterplot
to visualize relationship of one feature against all other.
- #50: Added
plot_boxplot
to visualize continuous distributions broken down by another feature.
Enhancements
- #44: Added option to exclude categories in
group_category
.
- #45: Added title option for all plots.
- #46: Added option to exclude columns in
set_missing
.
- #49 [Breaking Change]: Switched package to tidyverse style. All old functions are in
.Deprecated
mode. List of name changes in alphabetical order:
BarDiscrete
-> plot_bar
CollapseCategory
-> group_category
CorrelationContinuous
-> plot_correlation(..., type = "continuous")
CorrelationDiscrete
-> plot_correlation(..., type = "discrete")
DensityContinuous
-> plot_density
DropVar
-> drop_columns
GenerateReport
-> create_report
HistogramContinuous
-> plot_histogram
PlotMissing
-> plot_missing
PlotStr
-> plot_str
SetNaTo
-> set_missing
SplitColType
-> split_columns
- #52: Combined
CorrelationContinuous
and CorrelationDiscrete
into one function, and added option to view correlation of all features at once.
- Optimized layout for multiple plots.
Bug Fixes
- #47: Fixed color scale for correlation heatmap.
DataExplorer 0.4.0
New Features
- #33: Added
PlotStr
to visualize data structure.
- #40: Added network graph to
GenerateReport
.
Bug Fixes
- #32: Fixed pandoc requirement error in unit test on cran.
- #34: Fixed error message when
quiet
is not supplied. In addition, report directory are printed through message()
instead of cat()
.
- #35: Fixed rprojroot not found error.
Enhancements
- #12: Added vignette: dataexplorer-intro.
- #36: Fixed warnings from data.table in
DropVar
.
- #37: Changed all
cat()
to message()
.
- #38: Added option to order bars in
BarDiscrete
.
- #39: Extended
SetNaTo
to discrete features.
- Added more examples to README.md.
DataExplorer 0.3.0
New Features
- #25: Added
SetNaTo
to quickly reset missing numerical values.
- #29: Added
DropVar
to quickly drop variables by either name or column position.
Bug Fixes
- #24:
CorrelationDiscrete
now displays all factor levels instead of full rank matrix from model.matrix
.
Enhancements
- #11: Functions with return values will now match the input class and set it back.
- #22: Added documentation for
num_all_missing
in SplitColType
.
- #23: Added additional measures (in addition to frequency) to
CollapseCategory
.
- #26: Removed density estimation section from report template.
- #31: Added flexibility to name the new category in
CollapseCategory
.
Other notes
- #30: In
CollapseCategory
, update = TRUE
will only work with input data as data.table
. However, it is still possible to view the frequency distribution with any input data class, as long as update = FALSE
.
DataExplorer 0.2.6
Bug Fixes
- #20: Fixed permission denied bug due to intermediates_dir argument in
knitr::render
.
Enhancements
- #16: Improved handling of missing values.
DataExplorer 0.2.5
Bug Fixes
- #18:
GenerateReport
now handles data without discrete or continuous features.
Enhancements
- #14: Updated rmarkdown template for
GenerateReport
.
- #1: Features with all
NA
values will be ignored in BarDiscrete
.
DataExplorer 0.2.4
Bug Fixes
- Fixed a major bug in
GenerateReport
function due to package renaming.
Enhancements
GenerateReport
will now print the directory of the report to console.
DataExplorer 0.2.3
New Features
- Added function
CollapseCategory
to collapse sparse categories for discrete features.
- Added correlation heatmap for both continuous and discrete features.
- Added density plot for continuous features.
Bug Fixes
- Fixed a bug in
BarDiscrete
and CorrelationDiscrete
for not plotting non-factor class.
- Minor changes for CRAN re-submission.
Enhancements
- Changed grid layout for
BarDiscrete
and HistogramContinuous
.
- Features with all missing values will be ignored.
- Switched position between continuous and discrete features in report template.
- Renamed package name to DataExplorer.
- Added NEWS.md.
- Removed
BoxplotContinuous
.