
Text searching in linux with grep
"Reach out and grep someone" - Bell Labs Unix
By Red Squirrel

Ever had to dig through a HUGE SQL backup file looking for something specific, only to find that a 30 GB file is practically impossible to open without eating all the RAM and taking forever? Grep can save your day in situations like this. Or say you need to find a specific string and you don't know which file it's in: grep can save your day there too. It also works for picking strings out of the output of other commands!

Grep is a wonderful command in Linux, and it makes processing large files or large amounts of data that much easier. Try opening a 30 GB file in Windows and then performing a string search, and if you have 20 of these files... good luck! You could transfer it all over to a Linux box on a 10 Mbps connection and then use grep, and the transfer wait would still be well worth it.

The two main uses for grep are finding lines in a file that match a keyword, and finding lines in a command's output that match a keyword. We'll start with files.

Working with files
I personally ran into this problem. I had 30 MySQL backups, where 1 was a day old, 2 was 2 days old, etc., and because the main backup restore after a reformat did not work properly, I needed to fetch the tables from a previous backup, in hopes they were there. So there I was, trying to open these 1 GB SQL files in my text editor in Windows. Each would take like 10 minutes just to open, and performing a search was also very slow and tedious. Even after closing the file, the PC would be very sluggish and require a reboot.

That's when I found out about grep, so I transferred these files to the Linux box, thankfully over 100 Mbps and not 10 as in the example above. You can search in one file or in a whole directory, and grep will return the lines that contain the specified string. You can also specify with the -B and -A options how many lines before and after each match you want to see.

The command below searches for "a string" and returns 3 lines before and 4 lines after each match. To give it a test, create a file called testfile.txt, put a bunch of random stuff in it, and on a few lines put "a string" without the quotes.

grep -B3 -A4 "a string" testfile.txt >grepout.log

This writes the results to grepout.log, but you can remove the >grepout.log part to have them printed right on the screen. Outputting to a file is not a bad idea when searching a huge number of files, since you will probably get a lot of matches and they will be easier to sort through afterwards.

Here's how you'd do the same thing, but searching all the files in /etc instead of testfile.txt.
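If you'd rather not type a test file by hand, here is a minimal sketch of the same experiment. The file contents and the temporary directory are made up for illustration; only the grep line itself comes from the example above.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical test data: ten lines, with "a string" on line 5 only.
printf '%s\n' 'line 1' 'line 2' 'line 3' 'line 4' 'here is a string' \
    'line 6' 'line 7' 'line 8' 'line 9' 'line 10' > testfile.txt

# -B3 keeps 3 lines before the match, -A4 keeps 4 lines after it.
grep -B3 -A4 "a string" testfile.txt > grepout.log

# Show what was captured: the match plus its surrounding lines.
cat grepout.log
```

With this data the log should hold lines 2 through 9: three lines before the match, the match itself, and four lines after.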

grep -R -B3 -A4 "a string" /etc/*

This is surprisingly quick too. Here's some of what it got on my system: just part of what appears to be a configuration file. From this point on I know that "a string" is in that file, so I can open it in vim and do a search (typing /a string in vim's command mode) and voila, found the string without having to manually check each file. Funny how it's talking about regular expressions, since we'll be looking at those later.
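To try a recursive search without touching /etc, something like the sketch below should do. The directory layout and file names are invented for the demo, and the -n option (print the line number of each match) is an addition of mine that makes the follow-up trip into vim a bit quicker.

```shell
# Build a small, hypothetical directory tree to search.
demo=$(mktemp -d)
mkdir -p "$demo/conf/sub"
printf '%s\n' 'first line' 'this has a string in it' > "$demo/conf/app.conf"
printf '%s\n' 'nothing to see here' > "$demo/conf/other.conf"
printf '%s\n' 'x' 'y' 'a string again' > "$demo/conf/sub/deep.conf"

# -R descends into subdirectories; -n prefixes each match with its line number.
# Each output line looks like: path:line-number:matching text
grep -Rn "a string" "$demo/conf" | sort > "$demo/matches.txt"

cat "$demo/matches.txt"
```

Only app.conf and sub/deep.conf should show up, each with the file name first, so you know exactly which file to open and which line to jump to.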

Grep with commands
Grep is not limited to searching files; you can also use it to limit a command's output to the lines that contain a specified string. For example:

locate spam | grep ^/etc/

The locate command is used to locate files on the hard drive. Normally locate spam would list every single file whose path contains spam, but that's a huge list, so a command such as the one above limits it to the /etc folder. The ^ tells grep that the line has to start with what follows. So /data/etc/spam would not match; without the ^, it would.

You can use grep with dir as well. If you're looking for the hosts file but don't remember whether it has an extension, or whether it ends in an s, you could use this command:
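To see the effect of the ^ without depending on what locate finds on your system, here is a small sketch that feeds grep a made-up list of paths. The three paths are hypothetical stand-ins for locate output.

```shell
# Hypothetical paths, one per line, standing in for "locate spam" output.
paths='/etc/mail/spamassassin
/data/etc/spam
/usr/share/spam'

# With ^ the line must START with /etc/.
anchored=$(printf '%s\n' "$paths" | grep '^/etc/')

# Without ^ the text /etc/ may appear anywhere in the line.
unanchored=$(printf '%s\n' "$paths" | grep '/etc/')

echo "with ^:"
echo "$anchored"
echo "without ^:"
echo "$unanchored"
```

The anchored search keeps only /etc/mail/spamassassin, while the unanchored one also lets /data/etc/spam through. The pattern is put in single quotes here so the shell passes it to grep untouched.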

dir | grep "host"


Remember, grep goes line by line, which is why pam.d, vsftpd.conf and other files are listed: a whole line is shown as long as "host" is somewhere on it.

Another command you can use grep with is top. Top shows system usage (sort of like the Task Manager in Windows) and updates every few seconds.

So issuing top | grep "httpd" would execute top but only show the lines that contain httpd. Every few seconds a new line would be printed if, and only if, httpd is in the top section of CPU usage. This is especially good if you are monitoring a program that is using too many resources, since every few seconds it will print a new line with the usage info. And if it stops printing, then you know the program has stopped using enough CPU to appear in the top list.

Grep can be used in conjunction with any command that prints text, as far as I know, and it has many features not mentioned so far, so the next page will simply show some examples of advanced uses of grep.
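If you only want the matching names and not their neighbours on the same line, one option is to make the listing print one file per line. The sketch below uses ls -1 instead of dir for that reason, on a made-up directory with hypothetical file names.

```shell
# Hypothetical stand-in for /etc with a few empty files.
etcdemo=$(mktemp -d)
touch "$etcdemo/hosts" "$etcdemo/hosts.allow" "$etcdemo/host.conf" \
      "$etcdemo/passwd" "$etcdemo/resolv.conf"

# ls -1 prints one name per line, so grep returns only the matching names.
found=$(ls -1 "$etcdemo" | grep "host")

echo "$found"
```

Here only the three names containing "host" come back, since passwd and resolv.conf no longer share a line with them.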


Latest comments (newest first)
Posted by Shinizo on June 06th 2005 (11:18)
Do you want an apple pie with that?
Posted by Red Squirrel on April 04th 2005 (12:13)
I think grep is only a linux thing, but there is a windows program that does the same but I forget what it's called.
Posted by Red Squirrel on February 02th 2005 (18:22)
These tests were done on RH8.0, which is considered somewhat old, so it's possible that things have changed.

And the reason for the -l was to have 1 file per line, since otherwise there are multiple files per line, which gives undesirable results in these examples. The results would have been the same if I had put the words in a file, one per line, but I just wanted to show examples using real-life situations.

© Copyright 2014 Ryan Auclair/IceTeks, All rights reserved