academic_observatory_workflows.scopus_telescope.telescope

Classes

DagParams

Parameters for the Scopus telescope DAG; dag_id is the id of the DAG.

Functions

create_dag(dag_params)

Create the Scopus telescope DAG.

Module Contents

class academic_observatory_workflows.scopus_telescope.telescope.DagParams(*, dag_id: str, cloud_workspace: observatory_platform.airflow.workflow.CloudWorkspace, institution_ids: List[str], scopus_conn_ids: List[str], view: str = 'STANDARD', earliest_date: pendulum.DateTime = pendulum.datetime(1800, 1, 1), bq_dataset_id: str = 'scopus', bq_table_name: str = 'scopus', api_bq_dataset_id: str = 'dataset_api', schema_folder: str = project_path('scopus_telescope', 'schema'), dataset_description: str = 'The Scopus citation database: https://www.scopus.com', table_description: str = 'The Scopus citation database: https://www.scopus.com', start_date: pendulum.DateTime = pendulum.datetime(2018, 5, 14), schedule: str = '@monthly', max_active_runs: int = 1, retries: int = 3)[source]
Parameters:
  • dag_id – the id of the DAG.

  • cloud_workspace – the cloud workspace settings.

  • institution_ids – list of institution IDs to use for the Scopus search query.

  • scopus_conn_ids – list of Scopus Airflow Connection IDs.

  • view – the Scopus search view type, either STANDARD or COMPLETE. See https://dev.elsevier.com/sc_search_views.html

  • earliest_date – earliest date to query for results.

  • bq_dataset_id – the BigQuery dataset id.

  • bq_table_name – the BigQuery table name.

  • api_bq_dataset_id – the BigQuery dataset ID to use when storing releases.

  • schema_folder – the SQL schema path.

  • dataset_description – description for the BigQuery dataset.

  • table_description – description for the BigQuery table.

  • observatory_api_conn_id – the Observatory API connection key. This parameter does not appear in the constructor signature above.

  • start_date – the start date of the DAG.

  • schedule – the schedule interval of the DAG.

  • max_active_runs – the maximum number of DAG runs that can be run at once.

  • retries – the number of times to retry a task.

dag_id[source]
cloud_workspace[source]
institution_ids[source]
scopus_conn_ids[source]
view = 'STANDARD'[source]
earliest_date[source]
bq_dataset_id = 'scopus'[source]
bq_table_name = 'scopus'[source]
api_bq_dataset_id = 'dataset_api'[source]
schema_folder[source]
dataset_description = 'The Scopus citation database: https://www.scopus.com'[source]
table_description = 'The Scopus citation database: https://www.scopus.com'[source]
start_date[source]
schedule = '@monthly'[source]
max_active_runs = 1[source]
retries = 3[source]
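
The way the documented defaults combine with caller-supplied values can be sketched as follows. This is only an illustration: `DagParamsSketch` is a hypothetical stand-in, not the library class. It mirrors a subset of the fields listed above, uses the standard-library `datetime` in place of `pendulum`, and all IDs are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class DagParamsSketch:
    """Hypothetical stand-in mirroring part of the documented DagParams signature."""

    # Required fields (no default in the documented signature).
    dag_id: str
    institution_ids: List[str]
    scopus_conn_ids: List[str]
    # Optional fields, with the defaults shown in the documentation.
    view: str = "STANDARD"
    bq_dataset_id: str = "scopus"
    bq_table_name: str = "scopus"
    start_date: datetime = datetime(2018, 5, 14, tzinfo=timezone.utc)
    schedule: str = "@monthly"
    max_active_runs: int = 1
    retries: int = 3


# Pass everything by keyword, since the documented constructor is keyword-only.
params = DagParamsSketch(
    dag_id="scopus_example",           # placeholder DAG id
    institution_ids=["60000000"],      # placeholder institution ID
    scopus_conn_ids=["scopus_key_1"],  # placeholder Airflow connection ID
    view="COMPLETE",                   # override the STANDARD default
)

print(params.view, params.schedule, params.retries)
```

Only `view` is overridden here; every other optional field keeps its documented default.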
academic_observatory_workflows.scopus_telescope.telescope.create_dag(dag_params: DagParams)[source]

Create the Scopus telescope DAG from the given DagParams.
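
A minimal sketch of wiring the two documented names together, under assumptions: `build_scopus_dag` is a hypothetical helper, the IDs are placeholders, and the caller is assumed to supply an already constructed `CloudWorkspace`. The return value of `create_dag` is not documented on this page; the sketch simply passes it through.

```python
def build_scopus_dag(cloud_workspace):
    """Hypothetical helper: build the Scopus telescope DAG for one workspace.

    The import is deferred to call time so that this function can be defined
    in environments where the workflows package is not installed.
    """
    from academic_observatory_workflows.scopus_telescope.telescope import (
        DagParams,
        create_dag,
    )

    dag_params = DagParams(
        dag_id="scopus_example",           # placeholder DAG id
        cloud_workspace=cloud_workspace,   # prepared by the caller
        institution_ids=["60000000"],      # placeholder institution ID
        scopus_conn_ids=["scopus_key_1"],  # placeholder Airflow connection ID
    )
    # Remaining parameters fall back to the documented defaults
    # (view='STANDARD', schedule='@monthly', retries=3, and so on).
    return create_dag(dag_params)
```

Keeping the parameter construction in one helper keeps the four required arguments together in one place and leaves everything else at its documented default.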