data_upload

class virtualitics_sdk.nlp.data_upload.DataUpload(show_model_drop=False)

Bases: Step

corpus_name_textinput_placeholder = ''
corpus_name_textinput_title = '[Optional] Network Graph Name'
corpus_requirements = "In order to use the Virtualitics AI Platform's NLP Pipeline, the uploaded data must meet the following series of requirements:\n - Have at least one column comprised of natural language text (narrative column).\n- Have at least one column that uniquely identifies each row in the data set or each document to be processed (document ID column).\n- The data must be stored as a comma-separated values (.csv) file where the first row consists of column names."
data_source_title = 'Requirements'
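
The requirements listed in corpus_requirements can be checked locally before a file is uploaded. The following is a minimal sketch using only the Python standard library. It is not part of the SDK: the function name check_corpus_csv and the column names in the sample are hypothetical. It only checks that both columns appear in the header row, that document IDs are unique, and that the narrative text is not empty.

```python
import csv
import io


def check_corpus_csv(csv_text, narrative_col, doc_id_col):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper, not an SDK function.
    """
    # The first row of the CSV is read as the column names.
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []

    problems = []
    for col in (narrative_col, doc_id_col):
        if col not in columns:
            problems.append(f"missing column: {col}")
    if problems:
        return problems

    seen_ids = set()
    for row_number, row in enumerate(reader, start=1):
        doc_id = row[doc_id_col]
        # The document ID column must uniquely identify each row.
        if doc_id in seen_ids:
            problems.append(f"duplicate document ID {doc_id!r} in data row {row_number}")
        seen_ids.add(doc_id)
        # The narrative column must contain some text.
        if not (row[narrative_col] or "").strip():
            problems.append(f"empty narrative text in data row {row_number}")
    return problems


# Sample data with hypothetical column names.
sample = "doc_id,notes\n1,Pump seal replaced\n2,Inspected valve\n2,\n"
for problem in check_corpus_csv(sample, "notes", "doc_id"):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in the file at once.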
get_corpus_name(flow_metadata)

The corpus name is used as the name of the final KG output, which is the name used once the output is imported into Explore.

Args:

flow_metadata:

Returns: The value selected by the user or the default one

get_model_or_default(store_interface)

In advanced mode, the user can select the spaCy model to use from a dropdown list. This method can be called from other steps to get the value selected by the user or the default one.

Args:

store_interface:

Returns: The spaCy model selected by the user or the default one
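
The "selected value or default" behavior described here can be illustrated with a small standalone sketch. This is an assumption about the general pattern, not the SDK implementation: the function model_or_default and its user_selection argument are hypothetical and do not use store_interface. The default value is the one documented in model_selection_default.

```python
# Same value as the documented model_selection_default attribute.
DEFAULT_SPACY_MODEL = "en_core_web_lg"


def model_or_default(user_selection, default=DEFAULT_SPACY_MODEL):
    """Return the user's model choice, or the default when nothing was chosen.

    Hypothetical helper, not an SDK function.
    """
    # Treat None and blank strings as "no selection made".
    if user_selection is None or not user_selection.strip():
        return default
    return user_selection.strip()


print(model_or_default(None))              # en_core_web_lg
print(model_or_default("en_core_web_sm"))  # en_core_web_sm
```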

main_section = 'NLP Dataset Upload'
model_selection_default = 'en_core_web_lg'
model_selection_title = 'Model Selection'
run(flow_metadata)
step_title = 'Dataset Selection'