I came across a file format I had not seen before, “DSV”. How do I read this file in Python? What is the process to convert it to “csv” format?
Hi @md.vasim26:
I have not heard of this type of file before, but you can try downloading this package:
dsv2csv
Takes DSV on standard input, and outputs in Excel-format CSV (Python’s default dialect) on stdout.
cat /etc/passwd | dsv2csv > passwd.csv
Remember that this involves downloading the tool as mentioned above.
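For reference, a conversion of this kind can also be sketched with Python's standard csv module, without installing anything. This is not how dsv2csv itself is implemented, just a minimal sketch assuming a single-character delimiter (a colon, as in /etc/passwd) and no escaped delimiters or quoting inside fields. The function name dsv_to_csv is made up for illustration.

```python
import csv
import io


def dsv_to_csv(src, dst, delimiter=":"):
    """Copy rows from a delimiter-separated stream to a CSV stream.

    Hypothetical helper: assumes a one-character delimiter and no
    escaped delimiters or quoting inside fields.
    """
    reader = csv.reader(src, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    writer = csv.writer(dst)  # default "excel" dialect, i.e. plain CSV
    for row in reader:
        writer.writerow(row)


# Small in-memory demonstration with passwd-style lines
sample = io.StringIO(
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)
out = io.StringIO()
dsv_to_csv(sample, out)
csv_text = out.getvalue()
```

With real files, open both the input and the output with newline="" so the csv module controls the line endings itself.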
Hope this helps!
Can you show us a few sample rows from the file so we can check its delimiter?
I have attached the file; this is one sample. As it is a large file, I don’t want to read each line, remove the (|*|), and then write it out again.
@md.vasim26: Is this how the data looks before you store it in the DSV file? Or did you manage to read it? If you did, you could try:
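import pandas as pd

# dataset is assumed to be the rows you have already read,
# each row being a list of strings
clean_rows = []
for row in dataset:
    clean_data = []
    for data in row:
        # split on the literal "|*|" and keep the pieces in one flat list
        clean_data.extend(data.split('|*|'))
    clean_rows.append(clean_data)
clean_rows_df = pd.DataFrame(clean_rows)
clean_rows_df.to_csv("file.csv")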
Since it's a DSV file, I doubt you can use the read_csv method.
Hope this helps!
You can define sep while reading the file with pd.read_csv, like this:
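import pandas as pd
# "|" and "*" are regex metacharacters, so escape them in a raw string; a regex sep needs the python engine
df = pd.read_csv("file_name", sep=r"\|\*\|", engine="python") # read file
df.to_csv("file_name_csv") # convert to csv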
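If the file is too large to load into a DataFrame comfortably, the same conversion can be done one line at a time with the standard library. The csv module only accepts single-character delimiters, so this sketch splits each line on the literal "|*|" itself and lets csv.writer take care of quoting. The function name and the sample rows are made up for illustration, and it assumes "|*|" never appears inside a field.

```python
import csv
import io


def convert_lines(src, dst, delimiter="|*|"):
    """Stream rows split on a multi-character delimiter into CSV.

    Hypothetical helper; assumes the delimiter never occurs inside a field.
    """
    writer = csv.writer(dst)
    for line in src:
        # Drop the trailing newline, then split on the literal delimiter
        writer.writerow(line.rstrip("\r\n").split(delimiter))


# In-memory demonstration; the rows are invented sample data
sample = io.StringIO("id|*|name|*|city\n1|*|Asha|*|Pune, IN\n")
out = io.StringIO()
convert_lines(sample, out)

# Read the result back to check that fields survived the round trip
rows = list(csv.reader(io.StringIO(out.getvalue())))
```

Fields that contain a comma, such as "Pune, IN" here, are quoted by csv.writer so the output stays valid CSV. For real files, pass file objects opened with newline="" instead of the StringIO buffers.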
It works… I had tried this earlier but provided “sep” incorrectly.