watex.utils.funcutils.to_hdf5
- watex.utils.funcutils.to_hdf5(d, /, fn, objname=None, close=True, **hdf5_kws)
Store frame data in Hierarchical Data Format 5 (HDF5).
Note that if d is a dataframe, the dependency ‘pytables’ must already be installed; otherwise an error is raised.
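Because the dataframe path depends on PyTables, it can help to check for the dependency before storing anything. The snippet below is a minimal sketch and not part of watex: `has_pytables` is a hypothetical helper, and it assumes PyTables is importable under its usual module name, `tables`.

```python
import importlib.util


def has_pytables() -> bool:
    """Return True if the PyTables module ('tables') can be located.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    return importlib.util.find_spec("tables") is not None


if not has_pytables():
    # PyTables is published on PyPI under the name 'tables'.
    print("PyTables not found; install it (e.g. `pip install tables`) "
          "before storing a dataframe.")
```

Looking the module up with `find_spec` avoids importing it just to test for its presence.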
- Parameters:
d (ndarray or dataframe) – data to store in HDF5 format.
fn (str) – file path to the HDF5 file.
objname (str) – name of the data to store.
close (bool, default=True) – when data is given as an array, more data can still be added if close is set to False; otherwise, users need to open the file again in read mode ‘r’ before continuing to add data.
hdf5_kws (dict of pandas.pd.HDFStore) – additional keyword arguments passed to pd.HDFStore. They could be:
- mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’
'r': Read-only; no data can be modified.
'w': Write; a new file is created (an existing file with the same name would be deleted).
'a': Append; an existing file is opened for reading and writing, and if the file does not exist it is created.
'r+': Similar to 'a', but the file must already exist.
- complevel : int, 0-9, default None
Specifies a compression level for data. A value of 0 or None disables compression.
- complib : {‘zlib’, ‘lzo’, ‘bzip2’, ‘blosc’}, default ‘zlib’
Specifies the compression library to be used. As of pandas v0.20.2 these additional compressors for Blosc are supported (default if no compressor specified: ‘blosc:blosclz’): {‘blosc:blosclz’, ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’, ‘blosc:zlib’, ‘blosc:zstd’}.
Specifying a compression library which is not available issues a ValueError.
- fletcher32 : bool, default False
If applying compression, use the fletcher32 checksum.
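The constraints on these keyword arguments can be checked before they are forwarded. The following is a sketch only: `check_hdf5_kws` is a hypothetical helper that does not exist in watex or pandas. It mirrors the allowed values listed above and returns a dictionary suitable for passing as `**hdf5_kws`. It does not check whether the chosen compression library is actually installed.

```python
# Allowed values, copied from the parameter descriptions above.
VALID_MODES = {"a", "w", "r", "r+"}
VALID_COMPLIBS = {
    "zlib", "lzo", "bzip2", "blosc",
    "blosc:blosclz", "blosc:lz4", "blosc:lz4hc",
    "blosc:snappy", "blosc:zlib", "blosc:zstd",
}


def check_hdf5_kws(mode="a", complevel=None, complib=None, fletcher32=False):
    """Validate HDFStore-style keywords and return them as a dict.

    Hypothetical helper for illustration; raises ValueError on any
    value outside the documented choices.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if complevel is not None and not 0 <= complevel <= 9:
        raise ValueError(f"complevel must be None or in 0-9, got {complevel!r}")
    if complib is not None and complib not in VALID_COMPLIBS:
        raise ValueError(f"unknown complib {complib!r}")
    return {
        "mode": mode,
        "complevel": complevel,
        "complib": complib,
        "fletcher32": bool(fletcher32),
    }


# Example: write mode with Blosc/LZ4 compression at level 5.
hdf5_kws = check_hdf5_kws(mode="w", complevel=5, complib="blosc:lz4")
```

The resulting dictionary could then be unpacked into the call, for example `to_hdf5(data, fn=store_path, objname='h502', **hdf5_kws)`.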
- Returns:
store
- Return type:
Dict-like IO interface for storing pandas objects.
Examples
>>> import os
>>> from watex.utils.funcutils import sanitize_frame_cols, to_hdf5
>>> from watex.utils import read_data
>>> data = read_data('data/boreholes/H502.xlsx')
>>> sanitize_frame_cols(data, fill_pattern='_', inplace=True)
>>> store_path = os.path.join('watex/datasets/data', 'h')  # 'h' is the name of the data
>>> store = to_hdf5(data, fn=store_path, objname='h502')
>>> store
...
>>> # fetch the data
>>> h502 = store['h502']
>>> h502.columns[:3]
... Index(['hole_number', 'depth_top', 'depth_bottom'], dtype='object')
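The dictionary-style access used above can also be tried without the watex data files by working with `pd.HDFStore` directly. This is a rough sketch under stated assumptions: `roundtrip_demo` is a hypothetical function, the frame contains made-up values that only reuse the three column names shown above, and the function returns None when pandas or PyTables is not installed.

```python
import os
import tempfile


def roundtrip_demo():
    """Write a small frame to HDF5 and read its column names back.

    Hypothetical demo; returns None if pandas or PyTables is missing.
    """
    try:
        import pandas as pd
        import tables  # noqa: F401  (PyTables, needed by pd.HDFStore)
    except ImportError:
        return None

    # Made-up values for illustration only.
    frame = pd.DataFrame(
        {"hole_number": ["demo"], "depth_top": [0.0], "depth_bottom": [1.0]}
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "h.h5")
        # Store the frame under the key 'h502' with zlib compression.
        with pd.HDFStore(path, mode="w", complevel=5, complib="zlib") as store:
            store["h502"] = frame
        # Reopen read-only and fetch the frame by key.
        with pd.HDFStore(path, mode="r") as store:
            return list(store["h502"].columns)


columns = roundtrip_demo()
```

Using the store as a context manager closes the file when the block exits, so the temporary directory can be removed cleanly.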