Text searching in Linux with grep
"Reach out and grep someone" - Bell Labs Unix
By Red Squirrel


Running grep --help shows that grep is quite powerful; it even supports regular expressions, which come in handy in some situations. All examples here pipe the output of dir into grep, but as shown on the first page, grep can be used with any command or directly on files.
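As a quick reminder of those two ways of calling grep, here is a minimal sketch. The scratch file words.txt and its contents are made up for illustration and are not part of the article's test directory.

```shell
#!/bin/sh
# Hypothetical scratch file: the name and the words are assumptions.
workdir=$(mktemp -d)
printf '%s\n' apple Banana cherry > "$workdir/words.txt"

# 1) grep filtering the output of another command through a pipe
from_pipe=$(cat "$workdir/words.txt" | grep an)

# 2) grep reading the file directly
from_file=$(grep an "$workdir/words.txt")

echo "pipe: $from_pipe"
echo "file: $from_file"

rm -r "$workdir"
```

Both calls should select the same lines; only the source of the text differs.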

First, here is the directory structure used in the examples:

These can all be files or folders; it does not really matter. Case DOES matter, so if you want to try this out, make sure you type the names in the right case. Hint: type touch, then paste all the names on the same line, and it will create them in one shot (better to do this in a blank folder so you can just rm it afterwards). I also included a lot of names that match under different rules (e.g. case insensitive vs. case sensitive) to make it easier to play around with.
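The touch hint could look roughly like the sketch below. The five names are placeholders chosen for illustration, not the article's actual list, and mktemp -d is just one way to get a throwaway folder.

```shell
#!/bin/sh
# Work in a throwaway folder so cleanup is a single rm.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Hypothetical names; substitute the ones you want to test with.
touch base DataBase molecules mollecules Atoms

# Count what was created (tr strips padding some wc versions add).
count=$(ls | wc -l | tr -d ' ')
echo "created $count entries in $scratch"

# Leave the folder before removing it.
cd / && rm -r "$scratch"
```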

Also note that dir -l prints a long listing with one entry per line instead of several columns, which makes these examples easier to follow.

Before we start, here is a typical dir -l on our test directory:


This gives you an idea of the data grep is filtering. Notice that the read/write permissions are shown symbolically (-rw-) rather than in numeric 644-style format. That is handy for grep: to search by file permission you can look for -rw- instead of 6, and there is much less chance of -rw- turning up in a filename or directory name than a 6.
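As an aside, a permission search might look like the sketch below. The three listing lines are invented sample data in the style of a long listing, not output from the article's test directory. One thing to watch: the pattern starts with a dash, so it is passed with -e to keep grep from reading it as an option.

```shell
#!/bin/sh
# Invented sample lines in long-listing style (an assumption).
listing='-rw-r--r--    1 root     root            0 Feb  2 17:01 base
drwxr-xr-x    2 root     root         4096 Feb  2 17:02 molecules
-r--r--r--    1 root     root            0 Feb  2 17:03 readonly'

# -e marks the next argument as the pattern, so the leading dash
# is not mistaken for a command-line option.
owner_rw=$(printf '%s\n' "$listing" | grep -e '-rw-')

echo "$owner_rw"
```

Only the first sample line carries -rw-, so it is the only one that should come back.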

Our first example is case insensitivity with the -i parameter, since by default grep is case sensitive.

dir -l |grep -i base

Returns:


You can also search using regular expressions. grep patterns are regular expressions by default; the -P parameter used here switches to Perl-compatible syntax.

dir -l |grep -P "\:[0-9][0-9] ...."

Returns:


A line matches if it contains a :, then two digits from 0-9, a space, and then any 4 characters. Basically we only want to show files whose names are 4 characters or longer: the only place in this type of listing where a : is followed by two digits and a space is the time in the date section, and the filename comes right after it. If we had left out the :[0-9][0-9] part, the pattern would have matched everything, because fields earlier in the line (such as the 1, a space, then root) already contain a space followed by 4 characters. Also note that : is not a special character in a regular expression, so the backslash in \: is harmless but not required; characters such as . or ? are the ones that need escaping when you want them taken literally. As this is not a regex tutorial, I recommend you check one out to understand regex further; I may write one later to go into more detail about regular expressions.
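To see why the time anchor matters, here is a small sketch that counts matches with and without it. The listing lines are invented sample data, and -P assumes a grep built with Perl-compatible regex support (GNU grep usually is).

```shell
#!/bin/sh
# Invented sample lines in long-listing style (an assumption).
listing='-rw-r--r--    1 root     root            0 Feb  2 17:01 base
-rw-r--r--    1 root     root            0 Feb  2 17:03 cat
drwxr-xr-x    2 root     root         4096 Feb  2 17:02 molecules'

# Without the anchor: a space plus any 4 characters occurs in every
# line (for example in the padding before the link count).
loose=$(printf '%s\n' "$listing" | grep -c -P ' ....')

# With the anchor: only names of 4 or more characters qualify.
anchored=$(printf '%s\n' "$listing" | grep -c -P ':[0-9][0-9] ....')

# The escaped colon selects the same lines as the plain one.
escaped=$(printf '%s\n' "$listing" | grep -c -P '\:[0-9][0-9] ....')

echo "loose=$loose anchored=$anchored escaped=$escaped"
```

With this sample data the 3-character name cat drops out once the anchor is in place.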

More with regular expressions: if you're searching for a file whose name might have been misspelled, for example with too many l's, we can do this:

dir -l |grep -P "mol?l?l?l?ecules"

Returns:


l? means that the l does not have to be there. So l?l?l?l? can match llll, lll, ll, l, or no l at all. You could also put .? to indicate that there may be another character in between, but there does not have to be. Here's an example.

dir -l |grep -P "a.?.?.?.?s"

Returns:


As long as there's an "a", then 4 or fewer characters, then an "s", it matches.
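Both optional-character patterns can be tried on a plain word list, as in the sketch below. The words are made up for illustration and are not the article's test files; -P again assumes Perl-compatible regex support in grep.

```shell
#!/bin/sh
# Made-up word list, one word per line (an assumption).
words=$(printf '%s\n' molecules mollecules mollllllllecules \
    as atoms apples abcdefs)

# One required l plus up to four optional ones: far too many l's fail.
spelling=$(printf '%s\n' "$words" | grep -P 'mol?l?l?l?ecules')

# An a, then zero to four characters of any kind, then an s.
gap=$(printf '%s\n' "$words" | grep -P 'a.?.?.?.?s')

echo "spelling matches:"
echo "$spelling"
echo "gap matches:"
echo "$gap"
```

The word abcdefs has five characters between the a and the s, one more than four .? items allow, so it is left out.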

Now time to exclude stuff that matches.

dir -l |grep -v -i "s"

Returns:


The -i is for case insensitivity, as shown before, and -v inverts the match: it shows everything that DOES NOT match. So in this case, that is any entry that does not have an S in it. Remember that because we're using dir -l, lines whose user name has an s in it would not get listed either. To avoid that, you'd have to use a regular expression like we did with the test for 4 letter file names. Here's how we'd do that if we wanted to exclude every file that has an "o" in it.

dir -l |grep -P -v -i "\:[0-9][0-9].+o"

Returns:


Because we only want to eliminate files with the letter "o" in the name, and not files whose user has an "o" in it, we have to make sure that before the o there's a colon, 2 digits and any number of other characters. The .+ means "one or more of anything": . matches any character and + means the preceding item can be repeated one or more times.
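The difference between the plain and the anchored exclusion shows up clearly on a small sample. In this sketch the listing lines are invented and every entry is owned by root, which is an assumption made so that the owner column always contains an o.

```shell
#!/bin/sh
# Invented sample lines; every owner is root (an assumption).
listing='-rw-r--r--    1 root     root            0 Feb  2 17:01 base
-rw-r--r--    1 root     root            0 Feb  2 17:02 log
-rw-r--r--    1 root     root            0 Feb  2 17:03 cat'

# Plain exclusion: the o in root removes every line.
naive=$(printf '%s\n' "$listing" | grep -v -i o)

# Anchored exclusion: only an o after the time field counts,
# so just the entry named log is dropped.
anchored=$(printf '%s\n' "$listing" | grep -P -v -i ':[0-9][0-9].+o')

echo "plain -v kept: [$naive]"
echo "anchored -v kept:"
echo "$anchored"
```

Note that grep exits with a non-zero status when it selects no lines, which is what happens in the plain case here.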

There's way more you can do with grep, but regular expressions, inverted matches and the like are probably the things you'll use most. grep --help shows the parameters we used and many others.



I hope this article was informative to you and will make Linux that much more useful in the file management world.

-Red Squirrel
Owner





Latest comments (newest first)
Posted by Shinizo on June 06th 2005 (11:18)
Do you want an apple pie with that?
Posted by Red Squirrel on April 04th 2005 (12:13)
I think grep is only a linux thing, but there is a windows program that does the same but I forget what it's called.
Posted by Red Squirrel on February 02th 2005 (18:22)
These tests were done on RH8.0, which is considered somewhat old, so it is possible that things have changed.

And the reason for the -l was to have 1 file per line, since the default puts multiple files on each line, which gives undesirable results in these examples. The results would have been the same if I had put the words in a file, one per line, but I wanted to show examples using real-world situations.

© Copyright 2017 Ryan Auclair/IceTeks, All rights reserved