Python analyzing and changing data values
up vote
1
down vote
favorite
I have tried to make my program work, but as I don't have a lot of experience, I have faced a lot of problems. The idea of my work is that I analyze the data of the csv file and change the negative values to 0.
So far I have managed to get all the data from the file and save it to a list, and manually assigning the "start" and "end" times of the file.
import csv
data =
filename = str(input("Give the file name: "))
with open(filename) as csvfile:
spamreader = csv.reader(csvfile, delimiter=';')
for row in spamreader:
data.append(row)
print("File", filename, "is opened,", len(data),"rows will be analyzed.")
data.pop(0) #deletes the header
print (f'Analyzing data between dates {data[0][0]} and {data[1097][0]}.') #data[row][column]
How is it possible to change the negative values in the data (screenshot) to 0, and then save everything to a new csv file? Help would be massivly appreciated.
screenshot inside the CSV file
[In the actual data file there is 1000+ rows]
python csv
add a comment |
up vote
1
down vote
favorite
I have tried to make my program work, but as I don't have a lot of experience, I have faced a lot of problems. The idea of my work is that I analyze the data of the csv file and change the negative values to 0.
So far I have managed to get all the data from the file and save it to a list, and manually assigning the "start" and "end" times of the file.
import csv
data =
filename = str(input("Give the file name: "))
with open(filename) as csvfile:
spamreader = csv.reader(csvfile, delimiter=';')
for row in spamreader:
data.append(row)
print("File", filename, "is opened,", len(data),"rows will be analyzed.")
data.pop(0) #deletes the header
print (f'Analyzing data between dates {data[0][0]} and {data[1097][0]}.') #data[row][column]
How is it possible to change the negative values in the data (screenshot) to 0, and then save everything to a new csv file? Help would be massivly appreciated.
screenshot inside the CSV file
[In the actual data file there is 1000+ rows]
python csv
I added the data file for download
– Koppis
Nov 17 at 21:52
1
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
I have tried to make my program work, but as I don't have a lot of experience, I have faced a lot of problems. The idea of my work is that I analyze the data of the csv file and change the negative values to 0.
So far I have managed to get all the data from the file and save it to a list, and manually assigning the "start" and "end" times of the file.
import csv
data =
filename = str(input("Give the file name: "))
with open(filename) as csvfile:
spamreader = csv.reader(csvfile, delimiter=';')
for row in spamreader:
data.append(row)
print("File", filename, "is opened,", len(data),"rows will be analyzed.")
data.pop(0) #deletes the header
print (f'Analyzing data between dates {data[0][0]} and {data[1097][0]}.') #data[row][column]
How is it possible to change the negative values in the data (screenshot) to 0, and then save everything to a new csv file? Help would be massivly appreciated.
screenshot inside the CSV file
[In the actual data file there is 1000+ rows]
python csv
I have tried to make my program work, but as I don't have a lot of experience, I have faced a lot of problems. The idea of my work is that I analyze the data of the csv file and change the negative values to 0.
So far I have managed to get all the data from the file and save it to a list, and manually assigning the "start" and "end" times of the file.
import csv
data =
filename = str(input("Give the file name: "))
with open(filename) as csvfile:
spamreader = csv.reader(csvfile, delimiter=';')
for row in spamreader:
data.append(row)
print("File", filename, "is opened,", len(data),"rows will be analyzed.")
data.pop(0) #deletes the header
print (f'Analyzing data between dates {data[0][0]} and {data[1097][0]}.') #data[row][column]
How is it possible to change the negative values in the data (screenshot) to 0, and then save everything to a new csv file? Help would be massivly appreciated.
screenshot inside the CSV file
[In the actual data file there is 1000+ rows]
python csv
python csv
edited Nov 18 at 15:00
asked Nov 17 at 20:20
Koppis
386
386
I added the data file for download
– Koppis
Nov 17 at 21:52
1
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07
add a comment |
I added the data file for download
– Koppis
Nov 17 at 21:52
1
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07
I added the data file for download
– Koppis
Nov 17 at 21:52
I added the data file for download
– Koppis
Nov 17 at 21:52
1
1
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07
add a comment |
2 Answers
2
active
oldest
votes
up vote
1
down vote
accepted
If your file contains only numbers you can try the code below:
import csv
from collections import defaultdict
def convert(item):
try:
item = float(item)
if item < 0:
return 0
else:
return item
except ValueError:
return item
sums = defaultdict(list)
with open('Data.csv', 'r') as inp, open('output.csv', 'w', newline = '') as outp:
reader = csv.reader(inp, delimiter = ';')
writer = csv.writer(outp, delimiter = ';', dialect = 'excel')
headers = next(reader)
for line in reader:
line = [convert(i) for i in line]
sums[line[0]].append(line[1])
writer.writerow(line)
for k,v in sums.items():
print('key: {} total: {}'.format(k,sum(v)))
Firstly you can use the with
context manager to open both files that you are going to read and write, from and to respectively. Then instead of storing it into a list, you can transform your data and directly write it to your target file, here named output.csv
. The for line in reader
here loops through the lines and transforms each value into an absolute integer before writing it to the file.
I added the defaultdict
from the Python standard library if you want to store the values into a dictionary to use them. The dictionary sums
will include a key by date and its values. To print the sum by days, you can simply use a string format to print the key and the summed values.
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
|
show 5 more comments
up vote
0
down vote
This is very easily achieved using a pandas DataFrame:
import pandas as pd
c = pd.read_csv('inputcsvfile.csv')
processed_csv = c[c > 0].fillna(value=0)
processed_csv.to_csv('outputcsvfile.csv')
Pandas is an external library so you would need to install it (pip install pandas
) before you can use this solution
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
add a comment |
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
If your file contains only numbers you can try the code below:
import csv
from collections import defaultdict
def convert(item):
try:
item = float(item)
if item < 0:
return 0
else:
return item
except ValueError:
return item
sums = defaultdict(list)
with open('Data.csv', 'r') as inp, open('output.csv', 'w', newline = '') as outp:
reader = csv.reader(inp, delimiter = ';')
writer = csv.writer(outp, delimiter = ';', dialect = 'excel')
headers = next(reader)
for line in reader:
line = [convert(i) for i in line]
sums[line[0]].append(line[1])
writer.writerow(line)
for k,v in sums.items():
print('key: {} total: {}'.format(k,sum(v)))
Firstly you can use the with
context manager to open both files that you are going to read and write, from and to respectively. Then instead of storing it into a list, you can transform your data and directly write it to your target file, here named output.csv
. The for line in reader
here loops through the lines and transforms each value into an absolute integer before writing it to the file.
I added the defaultdict
from the Python standard library if you want to store the values into a dictionary to use them. The dictionary sums
will include a key by date and its values. To print the sum by days, you can simply use a string format to print the key and the summed values.
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
|
show 5 more comments
up vote
1
down vote
accepted
If your file contains only numbers you can try the code below:
import csv
from collections import defaultdict
def convert(item):
try:
item = float(item)
if item < 0:
return 0
else:
return item
except ValueError:
return item
sums = defaultdict(list)
with open('Data.csv', 'r') as inp, open('output.csv', 'w', newline = '') as outp:
reader = csv.reader(inp, delimiter = ';')
writer = csv.writer(outp, delimiter = ';', dialect = 'excel')
headers = next(reader)
for line in reader:
line = [convert(i) for i in line]
sums[line[0]].append(line[1])
writer.writerow(line)
for k,v in sums.items():
print('key: {} total: {}'.format(k,sum(v)))
Firstly you can use the with
context manager to open both files that you are going to read and write, from and to respectively. Then instead of storing it into a list, you can transform your data and directly write it to your target file, here named output.csv
. The for line in reader
here loops through the lines and transforms each value into an absolute integer before writing it to the file.
I added the defaultdict
from the Python standard library if you want to store the values into a dictionary to use them. The dictionary sums
will include a key by date and its values. To print the sum by days, you can simply use a string format to print the key and the summed values.
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
|
show 5 more comments
up vote
1
down vote
accepted
up vote
1
down vote
accepted
If your file contains only numbers you can try the code below:
import csv
from collections import defaultdict
def convert(item):
try:
item = float(item)
if item < 0:
return 0
else:
return item
except ValueError:
return item
sums = defaultdict(list)
with open('Data.csv', 'r') as inp, open('output.csv', 'w', newline = '') as outp:
reader = csv.reader(inp, delimiter = ';')
writer = csv.writer(outp, delimiter = ';', dialect = 'excel')
headers = next(reader)
for line in reader:
line = [convert(i) for i in line]
sums[line[0]].append(line[1])
writer.writerow(line)
for k,v in sums.items():
print('key: {} total: {}'.format(k,sum(v)))
Firstly you can use the with
context manager to open both files that you are going to read and write, from and to respectively. Then instead of storing it into a list, you can transform your data and directly write it to your target file, here named output.csv
. The for line in reader
here loops through the lines and transforms each value into an absolute integer before writing it to the file.
I added the defaultdict
from the Python standard library if you want to store the values into a dictionary to use them. The dictionary sums
will include a key by date and its values. To print the sum by days, you can simply use a string format to print the key and the summed values.
If your file contains only numbers you can try the code below:
import csv
from collections import defaultdict
def convert(item):
try:
item = float(item)
if item < 0:
return 0
else:
return item
except ValueError:
return item
sums = defaultdict(list)
with open('Data.csv', 'r') as inp, open('output.csv', 'w', newline = '') as outp:
reader = csv.reader(inp, delimiter = ';')
writer = csv.writer(outp, delimiter = ';', dialect = 'excel')
headers = next(reader)
for line in reader:
line = [convert(i) for i in line]
sums[line[0]].append(line[1])
writer.writerow(line)
for k,v in sums.items():
print('key: {} total: {}'.format(k,sum(v)))
Firstly you can use the with
context manager to open both files that you are going to read and write, from and to respectively. Then instead of storing it into a list, you can transform your data and directly write it to your target file, here named output.csv
. The for line in reader
here loops through the lines and transforms each value into an absolute integer before writing it to the file.
I added the defaultdict
from the Python standard library if you want to store the values into a dictionary to use them. The dictionary sums
will include a key by date and its values. To print the sum by days, you can simply use a string format to print the key and the summed values.
edited Nov 17 at 23:13
answered Nov 17 at 21:06
BernardL
1,740828
1,740828
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
|
show 5 more comments
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
The file not only integers
– Koppis
Nov 17 at 21:58
The file not only integers
– Koppis
Nov 17 at 21:58
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I added a helper function to check if it could be converted before turning it into an absolute value else it will just return the item as it is.
– BernardL
Nov 17 at 22:07
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
I don't really understand what your code does? The output (numers) are the same as in the input file, and there is extra spaces in between the cells.
– Koppis
Nov 17 at 22:13
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
Edited my answer based on your file, it was because of the delimiters and the newline settings for windows. Try and it and tell me what it is you do not understand.
– BernardL
Nov 17 at 22:24
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
That's perfect, thank you very much! If I'd want to sum all the values per day, how should it be done? So it would be 1.1.2016 369 416 2.1.2016 4 819 etc.. Would this be easy to do?
– Koppis
Nov 17 at 22:52
|
show 5 more comments
up vote
0
down vote
This is very easily achieved using a pandas DataFrame:
import pandas as pd
c = pd.read_csv('inputcsvfile.csv')
processed_csv = c[c > 0].fillna(value=0)
processed_csv.to_csv('outputcsvfile.csv')
Pandas is an external library so you would need to install it (pip install pandas
) before you can use this solution
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
add a comment |
up vote
0
down vote
This is very easily achieved using a pandas DataFrame:
import pandas as pd
c = pd.read_csv('inputcsvfile.csv')
processed_csv = c[c > 0].fillna(value=0)
processed_csv.to_csv('outputcsvfile.csv')
Pandas is an external library so you would need to install it (pip install pandas
) before you can use this solution
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
add a comment |
up vote
0
down vote
up vote
0
down vote
This is very easily achieved using a pandas DataFrame:
import pandas as pd
c = pd.read_csv('inputcsvfile.csv')
processed_csv = c[c > 0].fillna(value=0)
processed_csv.to_csv('outputcsvfile.csv')
Pandas is an external library so you would need to install it (pip install pandas
) before you can use this solution
This is very easily achieved using a pandas DataFrame:
import pandas as pd
c = pd.read_csv('inputcsvfile.csv')
processed_csv = c[c > 0].fillna(value=0)
processed_csv.to_csv('outputcsvfile.csv')
Pandas is an external library so you would need to install it (pip install pandas
) before you can use this solution
answered Nov 17 at 22:46
Mark
548
548
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
add a comment |
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
I have heard about panda, unfortunately this should be done in 'vanilla' python, so can't install external libraries.
– Koppis
Nov 17 at 23:02
1
1
Is this an assignment?
– Mark
Nov 17 at 23:03
Is this an assignment?
– Mark
Nov 17 at 23:03
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
It's not an assignment, this is just to analyze a lot amount of data to use with my friends. So it would be best to use the existing libraries. Although I'm thinking of Pandas, as it makes life so much easier with data as you showed tour code
– Koppis
Nov 17 at 23:10
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53355207%2fpython-analyzing-and-changing-data-values%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
I added the data file for download
– Koppis
Nov 17 at 21:52
1
As you start your question with "make my program work": is it pure coincidence that your screenshot and data file are the same as in stackoverflow.com/q/53325573/2564301 and "your work" bears more than a passing resemblance to my answer there?
– usr2564301
Nov 17 at 21:57
We are friends. I didn't find it necessary to mention it, but we have divided the work so we get it made faster and we can teach each other. So I'm not copying anothers idea or code for no reason.
– Koppis
Nov 17 at 22:07