sqllite
- class predict_backend.ml.nlp.handlers.sqllite.SqlLiteHandler(db_path, uri, narrative_feature, document_identifier, feature_names, components, overwrite_data=True)
Bases: PersistenceHandler
Handles data produced by the NLP module. Every component that is SqlCompliant is compatible with this PersistenceHandler. This handler may be deprecated soon and contains methods that are not implemented.
- Parameters:
db_path (str) – DB connection string.
uri (bool) – Whether db_path is a URI.
narrative_feature (str) – The feature containing the text.
document_identifier (str) – The id of the dataset.
feature_names (List[str]) – Features you want to save.
components (List[Type[SqlCompliant]]) – List of SqlCompliant classes, so the handler knows how to work with these components.
overwrite_data (bool) – Whether to overwrite pre-existing data, defaults to True.
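As a rough illustration of how db_path and uri might interact, the sketch below assumes the handler forwards both values to Python's standard sqlite3.connect; the source does not confirm this. The helper open_connection and the docs table are hypothetical names used only for this example.

```python
import sqlite3


def open_connection(db_path: str, uri: bool = False) -> sqlite3.Connection:
    """Hypothetical helper: open SQLite by plain path or by URI."""
    # sqlite3.connect interprets db_path as a URI only when uri=True.
    return sqlite3.connect(db_path, uri=uri)


# Plain path form: ":memory:" is a private in-memory database.
plain = open_connection(":memory:")

# URI form: a named in-memory database shared between connections.
shared_uri = "file:nlp_sketch?mode=memory&cache=shared"
writer = open_connection(shared_uri, uri=True)
reader = open_connection(shared_uri, uri=True)

# Data written through one connection is visible through the other.
writer.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY)")
writer.execute("INSERT INTO docs VALUES ('doc-1')")
writer.commit()

rows = reader.execute("SELECT doc_id FROM docs").fetchall()
```

With a plain file path the same helper would be called with uri left at False; the URI form is mainly useful for options such as mode or cache.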
- get_doc_data(doc_id)
- Parameters:
doc_id – Identifier of the doc.
- Return type:
Dict
- Returns:
The base doc data info extracted from the doc plus the columns of the original dataset.
- get_doc_entities(doc_id)
- Parameters:
doc_id – Identifier of the doc.
- Return type:
DataFrame
- Returns:
The result of querying the entities table for the given doc.
- get_doc_events(doc_id)
- Parameters:
doc_id – Identifier of the doc.
- Return type:
DataFrame
- Returns:
The result of querying the events table for the given doc.
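The per-document lookups above can be pictured as a filtered query on a single table. The following is a minimal sketch using only the standard sqlite3 module; the table layout (doc_id, text, label) and the helper rows_for_doc are assumptions made for illustration, and it returns plain dictionaries rather than the DataFrame the handler returns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name

# Hypothetical schema; the handler's real columns are not documented here.
conn.execute("CREATE TABLE entities (doc_id TEXT, text TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?, ?)",
    [
        ("doc-1", "Rome", "GPE"),
        ("doc-1", "Ada", "PERSON"),
        ("doc-2", "Oslo", "GPE"),
    ],
)
conn.commit()

ALLOWED_TABLES = {"entities", "events"}


def rows_for_doc(conn: sqlite3.Connection, table: str, doc_id: str) -> list:
    """Hypothetical helper: return all rows of `table` for one document."""
    # Table names cannot be bound as parameters, so they are whitelisted.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    cursor = conn.execute(
        f"SELECT * FROM {table} WHERE doc_id = ? ORDER BY rowid", (doc_id,)
    )
    return [dict(row) for row in cursor.fetchall()]


doc1_entities = rows_for_doc(conn, "entities", "doc-1")
```

The doc_id value is passed as a bound parameter, which avoids building the filter by string concatenation.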
- get_doc_ids()
- Returns:
A list of document ids with their respective ingestion times.
- get_table(table)
- Parameters:
table (str) – Name of the table you’re interested in.
- Returns:
DataFrame representing the data table produced by a component.
- init_components()
Initialize component persistence.
- init_persistence()
Initialize the persistence handler.
- initialize_database()
Initialize the database with the required tables.
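One plausible reading of how table initialization relates to the overwrite_data flag is sketched below: existing tables are dropped only when overwriting is requested, then created if missing. The SCHEMA dictionary and the function initialize_tables are hypothetical, and the handler's actual behavior may differ.

```python
import sqlite3

# Hypothetical table layout; the real handler's schema is not documented here.
SCHEMA = {
    "docs": "doc_id TEXT PRIMARY KEY, ingested_at TEXT",
    "entities": "doc_id TEXT, text TEXT, label TEXT",
    "events": "doc_id TEXT, text TEXT, label TEXT",
}


def initialize_tables(conn: sqlite3.Connection, overwrite_data: bool = True) -> None:
    """Create the required tables, dropping old ones only when overwriting."""
    for table, columns in SCHEMA.items():
        if overwrite_data:
            conn.execute(f"DROP TABLE IF EXISTS {table}")
        conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")
    conn.commit()


conn = sqlite3.connect(":memory:")
initialize_tables(conn)
conn.execute("INSERT INTO docs VALUES ('doc-1', '2024-01-01T00:00:00')")
conn.commit()

# Without overwriting, pre-existing rows survive a second initialization.
initialize_tables(conn, overwrite_data=False)
kept = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]

# With overwriting, the tables are recreated empty.
initialize_tables(conn, overwrite_data=True)
remaining = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```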
- insert_doc(doc, row_data)
Insert a doc into the persistence.
- Parameters:
doc (Doc) – The spaCy Doc object to insert into the buffer.
row_data – The extra features of the doc.
- start_buffered_ingestion()
Initialize the buffer to speed up the ingestion process.
- stop_buffered_ingestion()
Consume the buffer, merge the data and delete the buffer.
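The start, insert, stop cycle described above can be approximated by collecting rows in memory and writing them in a single transaction. This is a minimal sketch of that general pattern, not the handler's implementation; the class BufferedWriter, its method names, and the docs table are hypothetical.

```python
import sqlite3


class BufferedWriter:
    """Hypothetical stand-in for a handler that buffers inserts."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.buffer = None  # no buffer until ingestion starts

    def start(self) -> None:
        # Create an empty in-memory buffer.
        self.buffer = []

    def insert(self, doc_id: str, text: str) -> None:
        if self.buffer is None:
            raise RuntimeError("call start() before insert()")
        self.buffer.append((doc_id, text))

    def stop(self) -> int:
        if self.buffer is None:
            raise RuntimeError("call start() before stop()")
        # Write every buffered row in one transaction, then drop the buffer.
        with self.conn:
            self.conn.executemany("INSERT INTO docs VALUES (?, ?)", self.buffer)
        written = len(self.buffer)
        self.buffer = None
        return written


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, text TEXT)")

writer = BufferedWriter(conn)
writer.start()
for i in range(3):
    writer.insert(f"doc-{i}", f"narrative {i}")
written = writer.stop()

stored = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```

Batching the writes this way trades a little memory for fewer round trips to the database, which is the usual motivation for a buffered ingestion mode.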