watex.utils.read_data

watex.utils.read_data(f, sanitize=Ellipsis, reset_index=Ellipsis, comments='#', delimiter=None, columns=None, npz_objkey=None, verbose=Ellipsis, **read_kws)

Assert and read the specific files and URLs supported by the package.

Readable files are systematically converted to a data frame.

Parameters
  • f (str, Path-like object) – File path or pathlib.Path object. Must contain a valid file name and should point to a readable file or URL.

  • sanitize (bool, default=False) –

    Apply minimal sanitization to the data, such as:
    • replace non-alphabetic items in column names with the pattern ‘_’

    • cast data values to numeric where applicable

    • drop columns and rows that are entirely NaN

  • reset_index (bool, default=False) –

    Reset the index if full NaN columns are dropped after sanitization.

    New in version 0.2.5: Applies minimal data sanitization after reading the data.

  • comments (str or sequence of str or None, default='#') – The characters or list of characters used to indicate the start of a comment. None implies no comments. For backwards compatibility, byte strings will be decoded as ‘latin1’.

  • delimiter (str, optional) – The character used to separate the values. For backwards compatibility, byte strings will be decoded as ‘latin1’. The default is whitespace.

  • npz_objkey (str, optional) –

    Dataset key used to identify an array in multi-array storage in ‘.npz’ format. If no key was set when the ‘.npz’ file was saved, arr_0 should be used.

    New in version 0.2.7: Can read text and NumPy formats (‘.npy’ and ‘.npz’). Note that when data is stored in compressed ‘.npz’ format, the ‘.npz’ object key should be provided as the argument of parameter npz_objkey. If None, only the first array is read, as if npz_objkey='arr_0'.

  • verbose (bool, default=0) – Output messages to guide the user.

  • read_kws (dict) – Additional keyword arguments passed to the pandas file-reading functions.
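
The three sanitization steps listed under sanitize, together with the effect of reset_index, can be approximated with plain pandas. This is a minimal sketch and not the package's implementation: the helper name sanitize_frame, the regular expression, and the sample data are all assumptions made for illustration.

```python
import re

import numpy as np
import pandas as pd


def sanitize_frame(df, reset_index=False):
    """Hypothetical helper mimicking the three documented sanitization steps."""
    out = df.copy()
    # 1. Replace each run of non-alphabetic characters in column names with '_'.
    out.columns = [re.sub(r"[^A-Za-z]+", "_", str(c)) for c in out.columns]
    # 2. Cast values to numeric where possible; leave other columns unchanged.
    for col in out.columns:
        try:
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            pass
    # 3. Drop columns, then rows, that contain only NaN.
    out = out.dropna(axis=1, how="all").dropna(axis=0, how="all")
    if reset_index:
        # Renumber the rows 0..n-1 after rows have been dropped.
        out = out.reset_index(drop=True)
    return out


# Made-up sample data: one fully empty column and one fully empty row.
raw = pd.DataFrame(
    {
        "east.m": ["1.5", None, "3.0"],
        "rock-type": ["granite", None, "gneiss"],
        "empty": [np.nan, np.nan, np.nan],
    }
)
clean = sanitize_frame(raw, reset_index=True)
print(list(clean.columns))  # ['east_m', 'rock_type']
```

Without reset_index=True the surviving rows would keep their original labels (0 and 2 in this sample), which is the situation the reset_index parameter appears to address.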

Returns

f – A dataframe with head contents by default.

Return type

pandas.DataFrame
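
As a rough illustration of how a delimited text file can end up as a pandas.DataFrame, the sketch below uses numpy.loadtxt, whose comments and delimiter arguments match the descriptions given above. Whether read_data forwards these arguments to numpy.loadtxt internally is an assumption; the in-memory file, its values, and the column names are invented for the example.

```python
import io

import numpy as np
import pandas as pd

# In-memory stand-in for a small text file; the values are made up.
text = io.StringIO(
    "# station,resistivity\n"
    "0,120.5\n"
    "1,98.0\n"
)

# Lines starting with '#' are skipped; ',' separates the values.
arr = np.loadtxt(text, comments="#", delimiter=",")

# With delimiter=None (the default), values are split on whitespace.
ws = np.loadtxt(io.StringIO("1 2\n3 4\n"))

# Wrap the array in a data frame; the column names here are illustrative.
df = pd.DataFrame(arr, columns=["station", "resistivity"])
```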

See also

np.loadtxt

Load a text file.

np.load

Load uncompressed or compressed numpy .npy and .npz formats.

watex.utils.baseutils.save_or_load

Save or load numpy arrays.
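
The arr_0 default mentioned for npz_objkey comes from how NumPy names arrays in an ‘.npz’ archive. The sketch below uses only numpy.savez and numpy.load to show which keys exist in an archive; the file names and array names are illustrative, and passing one of these keys as npz_objkey is assumed to select the matching array.

```python
import os
import tempfile

import numpy as np

first = np.arange(3)
second = np.ones((2, 2))

with tempfile.TemporaryDirectory() as tmp:
    unnamed_path = os.path.join(tmp, "unnamed.npz")
    named_path = os.path.join(tmp, "named.npz")

    # Arrays passed positionally get automatic keys: arr_0, arr_1, ...
    np.savez(unnamed_path, first, second)
    # Arrays passed as keywords are stored under the given names.
    np.savez(named_path, resistivity=first, grid=second)

    with np.load(unnamed_path) as archive:
        unnamed_keys = list(archive.files)
        default_array = archive["arr_0"]

    with np.load(named_path) as archive:
        named_keys = list(archive.files)
        grid = archive["grid"]

print(unnamed_keys, named_keys)
```

Listing archive.files first is a quick way to find the key to supply when an archive holds several arrays.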