Commit 9f81c77

chore: clean up data cleaning residue (GoogleCloudPlatform#7188)
1 parent 3c3016b commit 9f81c77

2 files changed: 3 additions and 3 deletions
Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
-# Data Science Onramp - Data cleaning
+# Data Science Onramp - Data processing

 ## Testing Data

 - [Citibike trips](https://pantheon.corp.google.com/bigquery?p=bigquery-public-data&d=new_york_citibike&t=citibike_trips&project=data-science-onramp&folder=&organizationId=) from BigQuery public dataset
 - Output of data ingestion code
 - Differences from original dataset: the dataset was purposely made dirty so that it could be cleaned for a data science tutorial
-- License: Public Domain
+- License: Public Domain

data-science-onramp/data-processing/process.py

Lines changed: 1 addition & 1 deletion
@@ -160,7 +160,7 @@ def compute_end_udf(duration, start, end):
 TABLE = sys.argv[2]

 # Create a SparkSession, viewable via the Spark UI
-spark = SparkSession.builder.appName("data_cleaning").getOrCreate()
+spark = SparkSession.builder.appName("data_processing").getOrCreate()

 # Load data into dataframe if table exists
 try:
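
Both hunks swap an old "data cleaning" name for a "data processing" one. A reviewer could check for further leftovers with a small scan like the one below. This is only a minimal sketch: the helper `find_residue` and the `OLD_TERMS` tuple are hypothetical names that are not part of the repository, and the two search strings are simply the old names visible in the removed lines of this diff.

```python
import re

# Old names taken from the removed lines of this commit.
# Both this tuple and the helper below are hypothetical, for illustration only.
OLD_TERMS = ("data_cleaning", "Data cleaning")


def find_residue(text, old_terms=OLD_TERMS):
    """Return (line_number, line) pairs for lines that still mention an old name."""
    pattern = re.compile("|".join(re.escape(term) for term in old_terms))
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# The removed and added lines from the process.py hunk.
before = 'spark = SparkSession.builder.appName("data_cleaning").getOrCreate()'
after = 'spark = SparkSession.builder.appName("data_processing").getOrCreate()'

print(find_residue(before))  # one hit on line 1
print(find_residue(after))   # no hits
```

The match is case-sensitive and literal, so the phrase "could be cleaned" in the README is not flagged; other spellings of the old name would need to be added to the tuple.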
