Work more efficiently with log files using Linux csvkit
All IT pros and incident handlers have to deal almost daily with log files from various sources. Learn to work more quickly and efficiently to get the best out of CSV files with csvkit on Linux.
CSV files are often imported into Excel or LibreOffice software before being used and analyzed. This is very convenient and comfortable, as long as the files are not too big. But some log files might contain billions of lines, which makes it impossible to import them into spreadsheets. Or you might need to analyze files remotely on headless servers, without any ability to use a graphical user interface.
Luckily, an easy solution is available on all Linux operating systems: the csvkit software.
How to install csvkit
With the tool being available in the standard repositories, it is extremely easy to install. In this article, we'll use an Ubuntu-based operating system.
Let's start the installation in a command-line shell by executing:
sudo apt install csvkit
That's it. The system now installs the tool and all the necessary dependencies.
How to work on a CSV file
To illustrate our point, we'll work on a CSV file from SimpleMaps.com containing a list of cities and information about them: country, longitude, latitude, population and more.
The first line of the CSV file shows the different column names, as is often the case with CSV files. We can see it with the "head" command, which by default shows the first 10 lines of a file (Figure A).
Figure A
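As a minimal sketch of this first look at a file, the snippet below builds a tiny stand-in file and prints its first lines. The file name sample_cities.csv, its reduced set of columns and its rounded values are assumptions made up for illustration; they are not the SimpleMaps data.

```shell
# Create a small, made-up sample file in a temporary directory
workdir=$(mktemp -d)
cat > "$workdir/sample_cities.csv" <<'EOF'
city,lat,lng,country,population
Tokyo,35.7,139.7,Japan,37000000
Osaka,34.7,135.5,Japan,15000000
Paris,48.9,2.4,France,11000000
Lima,-12.1,-77.0,Peru,9000000
EOF

# head prints the first 10 lines by default; -n 3 limits it to
# the header line plus the first two data rows
head -n 3 "$workdir/sample_cities.csv"
```

On a very large log file, head returns immediately because it stops reading after the requested number of lines.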
How to figure out the columns of the file
Now let's start using csvcut from the command line, one of the tools embedded in csvkit. Launching the next command will automatically show the named columns and their indices (Figure B):
csvcut -n worldcities.csv
Figure B
We can then use either the indices or the column names to address the columns.
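The following sketch shows the column listing on a small made-up file and checks that an index and a name address the same column. The file sample_cities.csv and its values are illustrative assumptions, and the snippet assumes csvkit is already installed.

```shell
# Create a small, made-up sample file in a temporary directory
workdir=$(mktemp -d)
cat > "$workdir/sample_cities.csv" <<'EOF'
city,lat,lng,country,population
Tokyo,35.7,139.7,Japan,37000000
Osaka,34.7,135.5,Japan,15000000
Paris,48.9,2.4,France,11000000
Lima,-12.1,-77.0,Peru,9000000
EOF

# List the column indices and names
csvcut -n "$workdir/sample_cities.csv"

# In this sample, index 4 and the name "country" point at the same column
by_index=$(csvcut -c 4 "$workdir/sample_cities.csv")
by_name=$(csvcut -c country "$workdir/sample_cities.csv")
[ "$by_index" = "$by_name" ] && echo "index 4 and name 'country' select the same column"
```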
How to output selected columns
One of the most common operations when dealing with CSV files consists of selecting just a few columns, or reorganizing columns.
To output just a few columns, let's once again use the csvcut command, this time with the -c option. Both of the following command lines work; they show how to use either the indices or the column names. In our example, we'll once again pipe the output to the head command, just to show the first lines of the results (Figure C).
csvcut -c 1,5,10 worldcities.csv | head
csvcut -c city,country,population worldcities.csv | head
Figure C
Should we want line numbers added to the output, the -l option comes to the rescue and adds a new column named line_number to our output (Figure D).
Figure D
Output can of course be redirected to a new file by using the > character. From our previous example:
csvcut -l -c city,country,population worldcities.csv > newfile.csv
How to change the column order
Using csvcut, we can also create an output that reorders the columns. All we need is to specify the columns in the desired order, and the tool will display them accordingly (Figure E).
Figure E
How to sort the data with csvsort
It is possible to sort data using the csvsort command. Similar to csvcut, csvsort accepts the -n option to list columns, and the -c option to select a column by either its index or its name.
By default, csvsort sorts in ascending order, but the -r option switches it to descending order.
Let's sort our file by country name, in descending order (Figure F):
csvsort -r -c country worldcities.csv
Figure F
It is possible to sort on multiple columns: all you need is to pass them to the -c option (Figure G). The next line will sort our data in descending order by country and by population:
csvsort -r -c country,population worldcities.csv
Figure G
How to combine csvcut and csvsort
Csvsort is powerful, but it always outputs all the columns. By combining csvcut and csvsort, we can achieve any kind of column selection and sorting.
As an example, let's extract only the city name, country name and latitude, and sort those columns by latitude (Figure H).
csvcut -c city,country,lat worldcities.csv | csvsort -c lat
Figure H
How to get a nicer output
Should you want a nicer output, the csvlook command renders the CSV output in a Markdown-compatible, fixed-width format.
From our previous example, we just pipe the output into the csvlook command at the end of our line (Figure I):
csvcut -c city,country,lat worldcities.csv | csvsort -c lat | csvlook
Figure I
How to get statistics with csvstat
The csvstat command allows you to get different statistics on the CSV file.
Run with no arguments other than the filename, it provides detailed statistics for each column. It is also possible to use the -c option to output statistics for selected columns only (Figure J).
csvstat -c country worldcities.csv
Figure J
It is possible to tune the output of the command by using different options.
To get the number of unique values in the country column, we may use the --unique option (Figure K).
Figure K
For a list of all options of csvstat, please type the following command:
csvstat -h
Csvkit contains several different command-line tools that allow IT specialists, and anyone else who needs to work on large CSV files, to do it easily from the command line. The ability to combine these tools, especially csvcut and csvsort, makes it very powerful and should suit most professional needs.
Additionally, it is also possible to use csvkit to convert XLS and JSON files to CSV before analyzing them with the command-line tools.