watex.utils.funcutils.to_hdf5

watex.utils.funcutils.to_hdf5(d, /, fn, objname=None, close=True, **hdf5_kws)

Store frame data in Hierarchical Data Format 5 (HDF5).

Note that if d is a dataframe, the dependency ‘pytables’ must already be installed; otherwise an error is raised.
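Because the dataframe path relies on an optional package, it can help to check for it before calling the function. The snippet below is a minimal sketch and not part of watex: the helper name `has_pytables` is hypothetical, and it assumes the PyTables package is importable under its usual module name `tables`.

```python
import importlib.util


def has_pytables() -> bool:
    """Return True if the PyTables package (module name 'tables') can be located."""
    # find_spec only looks the module up; it does not import it.
    return importlib.util.find_spec("tables") is not None


if not has_pytables():
    # Assumption: PyTables is distributed on PyPI under the name 'tables'.
    print("PyTables is missing; install it (e.g. 'pip install tables') "
          "before storing a dataframe.")
```

Checking up front gives a clearer message than waiting for the storage call itself to fail.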

Parameters:
  • d (ndarray or DataFrame) – Data to store in HDF5 format.

  • fn (str) – File path to HDF5 file.

  • objname (str) – Name of the data to store.

  • close (bool, default=True) – When data is given as an array, more data can still be added if close is set to False; otherwise, users need to open the file again in read mode ‘r’ before adding more data.

  • hdf5_kws (dict) –

    Additional keyword arguments passed to pandas.HDFStore. They could be:

    • mode : {‘a’, ‘w’, ‘r’, ‘r+’}, default ‘a’

    'r'

    Read-only; no data can be modified.

    'w'

    Write; a new file is created (an existing file with the same name would be deleted).

    'a'

    Append; an existing file is opened for reading and writing, and if the file does not exist it is created.

    'r+'

    It is similar to 'a', but the file must already exist.

    • complevel : int, 0-9, default None

      Specifies a compression level for data. A value of 0 or None disables compression.

    • complib : {‘zlib’, ‘lzo’, ‘bzip2’, ‘blosc’}, default ‘zlib’

      Specifies the compression library to be used. As of v0.20.2 these additional compressors for Blosc are supported (default if no compressor specified: ‘blosc:blosclz’): {‘blosc:blosclz’, ‘blosc:lz4’, ‘blosc:lz4hc’, ‘blosc:snappy’, ‘blosc:zlib’, ‘blosc:zstd’}.

      Specifying a compression library which is not available raises a ValueError.

    • fletcher32 : bool, default False

      If applying compression, use the fletcher32 checksum.
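The constraints on these keywords can be checked before they are forwarded. The following is a rough sketch under stated assumptions: `check_hdf5_kws`, `VALID_MODES` and `VALID_COMPLIBS` are hypothetical names introduced here for illustration, not part of watex or pandas, and the accepted values are copied from the list above. The underlying library may perform its own validation.

```python
# Accepted values, taken from the parameter list above.
VALID_MODES = {"a", "w", "r", "r+"}
VALID_COMPLIBS = {
    "zlib", "lzo", "bzip2", "blosc",
    "blosc:blosclz", "blosc:lz4", "blosc:lz4hc",
    "blosc:snappy", "blosc:zlib", "blosc:zstd",
}


def check_hdf5_kws(**kws):
    """Validate the keywords that are present and return them unchanged.

    Hypothetical helper; keywords that are not given are left to the
    defaults of the function that eventually receives them.
    """
    mode = kws.get("mode", "a")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")

    complevel = kws.get("complevel")
    if complevel is not None and not (
        isinstance(complevel, int) and 0 <= complevel <= 9
    ):
        raise ValueError(f"complevel must be None or an int in 0-9, got {complevel!r}")

    complib = kws.get("complib")
    if complib is not None and complib not in VALID_COMPLIBS:
        raise ValueError(f"unknown compression library {complib!r}")

    return kws


# Overwrite any existing file and compress with Blosc/LZ4 at level 5.
hdf5_kws = check_hdf5_kws(mode="w", complevel=5, complib="blosc:lz4")
# The dict could then be forwarded, e.g.
# store = to_hdf5(data, fn=store_path, objname='h502', **hdf5_kws)
```

Validating early keeps a typo such as `mode='x'` from surfacing only once the file is being opened.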

Returns:

  store – Dict-like IO interface for storing pandas objects.

Examples

>>> import os
>>> from watex.utils.funcutils import sanitize_frame_cols, to_hdf5
>>> from watex.utils import read_data
>>> data = read_data('data/boreholes/H502.xlsx')
>>> sanitize_frame_cols(data, fill_pattern='_', inplace=True)
>>> store_path = os.path.join('watex/datasets/data', 'h')  # 'h' is the name of the data
>>> store = to_hdf5(data, fn=store_path, objname='h502')
>>> store
...
>>> # fetch the data
>>> h502 = store['h502']
>>> h502.columns[:3]
Index(['hole_number', 'depth_top', 'depth_bottom'], dtype='object')