# Data Processor API Reference
The `EliaDataProcessor` class provides high-level data processing capabilities for working with Elia OpenData datasets.

## `EliaDataProcessor`

High-level data processor for Elia OpenData datasets.
This class provides convenient methods for fetching and processing data from the Elia OpenData API. It supports multiple output formats and handles common data retrieval patterns automatically.
The processor can return data in three formats:

- JSON: raw list of dictionaries (default)
- Pandas: `pandas.DataFrame` for data analysis
- Polars: `polars.DataFrame` for high-performance data processing
Attributes:

| Name | Type | Description |
|---|---|---|
| `client` | `EliaClient` | The underlying API client for making requests. |
| `return_type` | `str` | The format for returned data (`"json"`, `"pandas"`, or `"polars"`). |
Example
Basic usage:
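A minimal sketch using default settings (the `EliaDataProcessor` import path is assumed):

```python
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed

processor = EliaDataProcessor()  # auto-created client, return_type="json"
records = processor.fetch_current_value("ods032")
print(records)
```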
With custom client and return type:
```python
from elia_opendata.client import EliaClient
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed

client = EliaClient(api_key="your_key")
processor = EliaDataProcessor(client=client, return_type="pandas")
df = processor.fetch_current_value("ods032")
print(df.head())
```
Date range queries:
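A sketch of a date-range query (import paths assumed; accepted date formats follow the `fetch_data_between` docs below):

```python
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed
from elia_opendata.dataset_catalog import TOTAL_LOAD

processor = EliaDataProcessor()
data = processor.fetch_data_between(TOTAL_LOAD, "2023-01-01", "2023-01-31")
print(f"Retrieved {len(data)} records")
```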
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `client` | `Optional[EliaClient]` | `EliaClient` instance for making API requests. If None, a new client with default settings will be created automatically. | `None` |
| `return_type` | `str` | Output format for processed data. Must be one of `"json"` (raw list of dictionaries, the default), `"pandas"` (`pandas.DataFrame`), or `"polars"` (`polars.DataFrame`). | `'json'` |
Raises:

| Type | Description |
|---|---|
| `ValueError` | If `return_type` is not one of the supported formats. |
Example
Default initialization:
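A minimal sketch (import path assumed):

```python
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed

processor = EliaDataProcessor()  # default EliaClient, return_type="json"
```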
With custom client:
```python
from elia_opendata.client import EliaClient
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed

client = EliaClient(api_key="your_key", timeout=60)
processor = EliaDataProcessor(client=client)
```
With pandas output:
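A minimal sketch (import path assumed):

```python
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed

processor = EliaDataProcessor(return_type="pandas")  # results returned as pandas.DataFrame
```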
### `fetch_current_value(dataset_id: str, **kwargs) -> Any`
Fetch the most recent value from a dataset.
This method retrieves the single most recent record from the specified dataset by automatically setting limit=1 and ordering by datetime in descending order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_id` | `str` | Unique identifier for the dataset to query. Use constants from the `dataset_catalog` module (e.g., `TOTAL_LOAD`). | *required* |
| `**kwargs` | | Additional query parameters to pass to the API: `where`, a filter condition in OData format; `select`, a comma-separated list of fields to retrieve; plus any other parameter supported by the API. | `{}` |
Returns:

| Type | Description |
|---|---|
| `Any` | The most recent record(s) in the format specified by `return_type`: a list of dictionaries for `"json"`, a `pandas.DataFrame` for `"pandas"`, or a `polars.DataFrame` for `"polars"`. |
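Example

A minimal sketch (import paths assumed):

```python
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed
from elia_opendata.dataset_catalog import TOTAL_LOAD

processor = EliaDataProcessor()
latest = processor.fetch_current_value(TOTAL_LOAD)  # single most recent record
print(latest)
```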
### `fetch_data_between(dataset_id: str, start_date: Union[str, datetime], end_date: Union[str, datetime], **kwargs) -> Any`
Fetch data between two dates with automatic pagination.
This method retrieves all records from the specified dataset within the given date range. It supports two modes:

1. Pagination mode (default): uses multiple API requests with pagination
2. Export mode: uses the bulk export endpoint for large datasets
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_id` | `str` | Unique identifier for the dataset to query. Use constants from the `dataset_catalog` module. | *required* |
| `start_date` | `Union[str, datetime]` | Start date for the query range. Either a `datetime` object or an ISO date string (e.g., `"2023-01-01"`). | *required* |
| `end_date` | `Union[str, datetime]` | End date for the query range. Either a `datetime` object or an ISO date string (e.g., `"2023-01-31"`). | *required* |
| `**kwargs` | | Additional query parameters: `export_data` (bool), if True, uses the export endpoint for bulk data retrieval instead of pagination (the default); `where`, additional filter conditions (combined with the date filter); `limit`, batch size for pagination (default 100) or maximum records for export; `order_by`, sort order for results; `select`, comma-separated fields to retrieve; plus any other API-supported parameter. | `{}` |
Returns:

| Type | Description |
|---|---|
| `Any` | All matching records in the format specified by `return_type`: a list of dictionaries for `"json"`, a `pandas.DataFrame` for `"pandas"`, or a `polars.DataFrame` for `"polars"`. |
Note
For large date ranges (>10,000 records), consider setting `export_data=True` to use the more efficient export endpoint. The export endpoint automatically uses the optimal format:

- JSON for the `"json"` return type
- Parquet for the `"pandas"` and `"polars"` return types
Example
Fetch data for January 2023:
```python
from datetime import datetime
from elia_opendata.data_processor import EliaDataProcessor  # module path assumed
from elia_opendata.dataset_catalog import TOTAL_LOAD

processor = EliaDataProcessor()
start = datetime(2023, 1, 1)
end = datetime(2023, 1, 31)
data = processor.fetch_data_between(TOTAL_LOAD, start, end)
print(f"Retrieved {len(data)} records")
```
Using export endpoint for large datasets:
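A sketch continuing from the setup above; `export_data` is the flag documented in the parameters table:

```python
bulk = processor.fetch_data_between(
    TOTAL_LOAD,
    datetime(2022, 1, 1),
    datetime(2022, 12, 31),
    export_data=True,  # bulk export endpoint instead of pagination
)
```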
With string dates:
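A sketch continuing from the setup above; ISO date strings are accepted in place of datetime objects:

```python
data = processor.fetch_data_between(TOTAL_LOAD, "2023-01-01", "2023-01-31")
```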
With additional filtering:
```python
measured_data = processor.fetch_data_between(
    TOTAL_LOAD,
    start,
    end,
    where="type='measured'",
    limit=500,  # larger batch size per request
)
```
As pandas DataFrame:
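A sketch, assuming the processor was created with `return_type="pandas"`:

```python
processor = EliaDataProcessor(return_type="pandas")
df = processor.fetch_data_between(TOTAL_LOAD, start, end)
print(df.head())
```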