All IT pros and incident handlers have to deal regularly with log files from various sources. Learn to work more quickly and efficiently and get the best out of CSV files with csvkit on Linux.
CSV files are often imported into Excel or LibreOffice before being used and analyzed. That is convenient and comfortable, as long as the files are not too big. But some log files might contain billions of lines, which makes it impossible to import them into a spreadsheet. Or you might need to analyze files remotely on headless servers, with no way to use a graphical user interface.
Luckily, an easy solution is available on all Linux operating systems: the csvkit software.
How to install csvkit
Because the tool is available in the standard repositories, it is extremely easy to install. In this article, we'll use an Ubuntu-based operating system.
Let's start the installation in a command-line shell by executing:
sudo apt install csvkit
That's it. The system now installs the tool and all the necessary dependencies.
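As a quick sanity check after installation, the following sketch reports whether the four csvkit tools used in this article can be found on the PATH. It relies only on the shell's own `command -v` and makes no assumption about the csvkit version.

```shell
# Report, for each csvkit tool used in this article, whether it is on the PATH.
for tool in csvcut csvsort csvlook csvstat; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

If a tool is reported as missing, the package was probably not installed correctly.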
How to work on a CSV file
To illustrate our point, we'll work on a CSV file from SimpleMaps.com containing a database of cities and information about them: country, longitude, latitude, population and more.
The first line of the CSV file shows the different column names, as is often the case with CSV files. We can see it with the "head" command, which by default shows the first 10 lines of a file (Figure A).
Figure A
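For readers without the SimpleMaps file at hand, here is a minimal sketch of the same step on a tiny stand-in file. The file name sample_cities.csv, its reduced set of columns and all its rows are made up for illustration and are not taken from the real dataset.

```shell
# Build a tiny stand-in file (made-up rows, not the SimpleMaps data).
cat > sample_cities.csv <<'EOF'
city,lat,lng,country,population
Alpha,10.5,20.1,Xland,5000
Beta,-33.2,18.4,Yland,12000
Gamma,48.9,2.3,Xland,800
EOF

# Show only the header row and the first two records.
head -n 3 sample_cities.csv
```

The -n option of head limits the output to the given number of lines instead of the default 10.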
How to figure out the columns of the file
Now let's start using csvcut, one of the tools included in csvkit, from the command line. Running the next command will automatically show the column names and their indices (Figure B):
csvcut -n worldcities.csv
Figure B
We can then use either the indices or the column names to address the columns.
How to output selected columns
One of the most common operations when dealing with CSV files consists of selecting just a few columns, or reorganizing columns.
To output just a few columns, let's once again use the csvcut command, this time with the -c option. Both of the following command lines work and give the same result: the first uses the column indices, the second the column names. In our example, we pipe the output to the head command, just to show the first lines of the results (Figure C).
csvcut -c 1,5,10 worldcities.csv | head
csvcut -c city,country,population worldcities.csv | head
Figure C
Should we want line numbers added to the output, option -l comes to the rescue and adds a new column named line_number to our output (Figure D).
Figure D
Output can of course be saved to a new file by redirecting it with the > character. From our previous example:
csvcut -l -c city,country,population worldcities.csv > newfile.csv
How to change the column order
Using csvcut we can also create an output that reorders the columns. All we need to do is specify the columns in the desired order, and the tool will display them accordingly (Figure E).
Figure E
How to sort the data with csvsort
It is possible to sort data using the csvsort command. Similar to csvcut, csvsort accepts the -n option to list the columns, and the -c option to select columns by either index or name.
By default, csvsort sorts in ascending order, but the -r option sorts in descending order.
Let's sort our file by country name, in descending order (Figure F):
csvsort -r -c country worldcities.csv
Figure F
It is possible to sort on multiple columns: all you need is to list them with the -c option (Figure G). The next line will sort our data in descending order by country and by population:
csvsort -r -c country,population worldcities.csv
Figure G
How to combine csvcut and csvsort
Csvsort is powerful, but it always outputs all the columns. By combining csvcut and csvsort, we can achieve any kind of selection and sorting.
As an example, let's extract only the city name, country name and latitude, and sort the result by latitude (Figure H).
csvcut -c city,country,lat worldcities.csv | csvsort -c lat
Figure H
How to get a nicer output
Should you want a nicer output, the csvlook command renders the CSV output in a Markdown-compatible, fixed-width format.
From our previous example, we just pipe the output into the csvlook command at the end of our line (Figure I):
csvcut -c city,country,lat worldcities.csv | csvsort -c lat | csvlook
Figure I
How to get statistics with csvstat
The csvstat command allows you to get different statistics on the CSV file.
Run with no argument other than the filename, it provides detailed statistics for each column. It is also possible to use the -c option to restrict the output to selected columns (Figure J).
csvstat -c country worldcities.csv
Figure J
It is possible to tune the output of the command by using different options.
To count the unique values of the country column, we may use the --unique option (Figure K).
Figure K
For a list of all options of csvstat, please type the following command:
csvstat -h
Csvkit contains several different command-line tools that allow IT specialists and anyone who needs to work on large CSV files to do it easily from the command line. The ability to combine these tools, especially csvcut and csvsort, makes it very powerful and should suit all the needs of professionals.
It is also possible to use csvkit to convert XLS and JSON files to CSV before analyzing them or using them with the command-line tools.