FileDataSource.get_dataframe(args_dict) should also provide simple equality-based selection #96

schuderer · 2020-03-31T16:52:09Z

Description

Right now, DBMS-style DataSources allow for querying using parameters provided through args_dict. The most common case is probably providing a value to check equality with (e.g. an ID for an item to fetch).

Although DataSources claim to be isomorphic towards model code, in the case of the FileDataSource, one would have to write specific code to select the desired record from the loaded CSV.

To make this claim somewhat more true (and usage more consistent between kinds of DataSources, at least for the equality case), I propose to add functionality to FileDataSource.get_dataframe to use the args_dict parameter for selection. args_dict would be a dictionary of column key(s) with values to equality-test. get_dataframe would return a subset of the originally loaded DataFrame.
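As a rough illustration of the proposed behaviour, here is a minimal sketch of equality-based selection on an already loaded DataFrame. The helper name `select_equal` and the sample data are made up for this example and are not part of the existing FileDataSource code. It assumes every key in args_dict is an existing column name.

```python
import pandas as pd


def select_equal(df, args_dict=None):
    """Return the rows of df where each column in args_dict equals its value.

    Hypothetical helper: an empty or missing args_dict returns df unchanged.
    """
    if not args_dict:
        return df
    # Start with all rows selected, then narrow down one column at a time.
    mask = pd.Series(True, index=df.index)
    for column, value in args_dict.items():
        mask &= df[column] == value
    return df[mask]


# Made-up sample data standing in for a loaded CSV.
df = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
result = select_equal(df, {"id": 2})
```

Multiple keys are combined with a logical AND, which seems the closest match to a `WHERE a = :a AND b = :b` clause in the DBMS case.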

Other comments

If there is a clean, reasonably fast, pandas-supported way to do this on CSVs without loading them into memory first, this would be preferable to loading all data first and then filtering. Maybe this is relevant: https://stackoverflow.com/questions/13651117/how-can-i-filter-lines-on-load-in-pandas-read-csv-function
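One possible approach, sketched below under assumptions, is to read the CSV in chunks with the `chunksize` parameter of `pandas.read_csv` and filter each chunk before concatenating. This still scans the whole file, but only one chunk plus the matching rows is held in memory at a time. The function name `read_csv_filtered` and the sample data are hypothetical.

```python
import io

import pandas as pd


def read_csv_filtered(path_or_buffer, args_dict, chunksize=10_000):
    """Read a CSV chunk by chunk, keeping only rows matching args_dict.

    Hypothetical helper; assumes all keys of args_dict are column names.
    """
    filtered = []
    for chunk in pd.read_csv(path_or_buffer, chunksize=chunksize):
        mask = pd.Series(True, index=chunk.index)
        for column, value in args_dict.items():
            mask &= chunk[column] == value
        filtered.append(chunk[mask])
    if not filtered:
        # No chunks were read, so there is nothing to concatenate.
        return pd.DataFrame()
    return pd.concat(filtered, ignore_index=True)


# Made-up CSV content; a tiny chunksize forces more than one chunk.
csv_text = "id,name\n1,a\n2,b\n3,c\n2,d\n"
result = read_csv_filtered(io.StringIO(csv_text), {"id": 2}, chunksize=2)
```

Whether this is faster than loading everything and filtering afterwards would depend on file size and chunk size, so it would need measuring.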

schuderer · 2020-05-13T08:45:53Z

A mapping between parameter names and column names would also be nice. Otherwise, to keep lookup behaviour consistent between a database source and a CSV source, one would have to rename the parameter in the SQL to match the corresponding column name.
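A minimal sketch of such a mapping, assuming it is supplied as a plain dictionary (for example from the DataSource's configuration). The names `to_column_filters` and `param_to_column` are hypothetical, as are the sample parameter and column names.

```python
def to_column_filters(args_dict, param_to_column=None):
    """Translate parameter names in args_dict to CSV column names.

    Hypothetical helper: parameters without a mapping entry are assumed
    to already be column names and are passed through unchanged.
    """
    param_to_column = param_to_column or {}
    return {
        param_to_column.get(param, param): value
        for param, value in args_dict.items()
    }


# Made-up example: the SQL uses :customer_id, the CSV column is "CustomerID".
filters = to_column_filters(
    {"customer_id": 42, "country": "NL"},
    {"customer_id": "CustomerID"},
)
```

The translated dictionary could then be used for the equality-based selection, so model code can pass the same args_dict to either kind of DataSource.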

schuderer added the enhancement New feature or request label Mar 31, 2020

schuderer mentioned this issue Mar 31, 2020

DBMS-DataSources should use memoization for queries with :params #97

Closed
