Merge lp://staging/~parthm/bzr/503670-grep-plugin into lp://staging/bzr

Proposed by Parth Malwankar
Status: Rejected
Rejected by: Andrew Bennetts
Proposed branch: lp://staging/~parthm/bzr/503670-grep-plugin
Merge into: lp://staging/bzr
Diff against target: 418 lines (+403/-0)
3 files modified
grep/__init__.py (+132/-0)
grep/grep.py (+61/-0)
grep/test_grep.py (+210/-0)
To merge this branch: bzr merge lp://staging/~parthm/bzr/503670-grep-plugin
Reviewer Review Type Date Requested Status
Martin Packman (community) Abstain
bzr-core Pending
Review via email: mp+20067@code.staging.launchpad.net
To post a comment you must log in.
Revision history for this message
Parth Malwankar (parthm) wrote :

=== Bug #503670 ===

This is not really a request to merge. Based on the discussion on bug #503670, I have started working on the lp:bzr-grep plugin. For lack of a better mechanism for review, based on the suggestion by poolie on IRC, I am raising this request so that the plugin can be reviewed and improved.

'bzr grep' currently searches the versioned files for a pattern and support common grep options. I have been able to test it on Linux (automated) and Windows (manual). Before adding history support, I would appreciate any review comment on this. Also, any pointers on history support will also be useful to me :-)

You can get the plugin at lp:bzr-grep if you wish to try it.

All the options listed in the help below are working:

[~]% bzr help grep
Purpose: Print lines matching PATTERN for specified files.
Usage: bzr grep PATTERN [PATH...]

Options:
  --from-root Search for pattern starting from the root of the branch.
                     (implies --recursive)
  -Z, --null Write an ascii NUL (\0) separator between output lines
                     rather than a newline.
  -v, --verbose Display more information.
  -R, --recursive Recurse into subdirectories.
  -q, --quiet Only display errors and warnings.
  -i, --ignore-case ignore case distinctions while matching.
  --usage Show usage message and options.
  -n, --line-number show 1-based line number.
  -h, --help Show help message.

From: plugin "grep"
See also: plugins/grep
[~]%

Revision history for this message
Martin Packman (gz) wrote :

Think I prefer the other version, trying to recreate grep in python seems a little daft.

Misses `-r --include=*.py` which is one of my more commonly used set of arguments.

review: Abstain
Revision history for this message
Parth Malwankar (parthm) wrote :

Thanks for reviewing the code.

> Think I prefer the other version, trying to recreate grep in python seems a
> little daft.
>

Having a plugin would allow us to have good support for history and view.
Some users might be able to use 'bzr ls [opts] | grep' to achieve the
same results but that would require grep installed on windows (not very common) and we
would end up with bug #503670 :-)

IMO haveing grep built on bzrlib API is a better approach. hg and git seem to do the
same as they also provide features like searching history. This would allow
us to provide similar features and possibly more in the future.

> Misses `-r --include=*.py` which is one of my more commonly used set of
> arguments.

Thanks for pointing out the use case.
If the general approach of the plugin seems ok (API used, any oversights etc.)
to to those more familiar with internals than me I hope to enhance it to add
views and history support along with more options over time.

Revision history for this message
Andrew Bennetts (spiv) wrote :

Unmerged revisions

5057. By Parth Malwankar

grep plugin for review. [branch to be discarded]

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added directory 'grep'
2=== added file 'grep/__init__.py'
3--- grep/__init__.py 1970-01-01 00:00:00 +0000
4+++ grep/__init__.py 2010-02-24 16:34:15 +0000
5@@ -0,0 +1,132 @@
6+# Copyright (C) 2010 Canonical Ltd
7+# Copyright (C) 2010 Parth Malwankar <parth.malwankar@gmail.com>
8+#
9+# This program is free software; you can redistribute it and/or modify
10+# it under the terms of the GNU General Public License as published by
11+# the Free Software Foundation; either version 2 of the License, or
12+# (at your option) any later version.
13+#
14+# This program is distributed in the hope that it will be useful,
15+# but WITHOUT ANY WARRANTY; without even the implied warranty of
16+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
17+# GNU General Public License for more details.
18+#
19+# You should have received a copy of the GNU General Public License
20+# along with this program; if not, write to the Free Software
21+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
22+"""bzr grep"""
23+
24+import os
25+import sys
26+
27+from bzrlib import errors
28+from bzrlib.commands import Command, register_command, display_command
29+from bzrlib.option import (
30+ Option,
31+ )
32+
33+from bzrlib.lazy_import import lazy_import
34+lazy_import(globals(), """
35+import re
36+
37+import grep
38+import bzrlib
39+from bzrlib import (
40+ osutils,
41+ bzrdir,
42+ trace,
43+ )
44+""")
45+
46+version_info = (0, 1)
47+
48+class cmd_grep(Command):
49+ """Print lines matching PATTERN for specified files.
50+ """
51+
52+ takes_args = ['pattern', 'path*']
53+ takes_options = [
54+ 'verbose',
55+ Option('line-number', short_name='n',
56+ help='show 1-based line number.'),
57+ Option('ignore-case', short_name='i',
58+ help='ignore case distinctions while matching.'),
59+ Option('recursive', short_name='R',
60+ help='Recurse into subdirectories.'),
61+ Option('from-root',
62+ help='Search for pattern starting from the root of the branch. '
63+ '(implies --recursive)'),
64+ Option('null', short_name='Z',
65+ help='Write an ascii NUL (\\0) separator '
66+ 'between output lines rather than a newline.'),
67+ ]
68+
69+
70+ @display_command
71+ def run(self, verbose=False, ignore_case=False, recursive=False, from_root=False,
72+ null=False, line_number=False, path_list=None, pattern=None):
73+ if path_list == None:
74+ path_list = ['.']
75+ else:
76+ if from_root:
77+ raise errors.BzrCommandError('cannot specify both --from-root and PATH.')
78+
79+ eol_marker = '\n'
80+ if null:
81+ eol_marker = '\0'
82+
83+ re_flags = 0
84+ if ignore_case:
85+ re_flags = re.IGNORECASE
86+ patternc = grep.compile_pattern(pattern, re_flags)
87+
88+ for path in path_list:
89+ tree, branch, relpath = bzrdir.BzrDir.open_containing_tree_or_branch(path)
90+
91+ if osutils.isdir(path):
92+ # setup rpath to open files relative to cwd
93+ rpath = relpath
94+ if relpath:
95+ rpath = osutils.pathjoin('..',relpath)
96+
97+ tree.lock_read()
98+ try:
99+ if from_root:
100+ # start searching recursively from root
101+ relpath=None
102+ recursive=True
103+
104+ for fp, fc, fkind, fid, entry in tree.list_files(include_root=False,
105+ from_dir=relpath, recursive=recursive):
106+ if fc == 'V' and fkind == 'file':
107+ grep.file_grep(tree, fid, rpath, fp, patternc,
108+ eol_marker, self.outf, line_number)
109+ finally:
110+ tree.unlock()
111+ else:
112+ id = tree.path2id(path)
113+ if not id:
114+ trace.warning("warning: file '%s' is not versioned." % path)
115+ continue
116+ tree.lock_read()
117+ try:
118+ grep.file_grep(tree, id, '.', path, patternc, eol_marker,
119+ self.outf, line_number)
120+ finally:
121+ tree.unlock()
122+
123+register_command(cmd_grep)
124+
125+def test_suite():
126+ from bzrlib.tests import TestUtil
127+
128+ suite = TestUtil.TestSuite()
129+ loader = TestUtil.TestLoader()
130+ testmod_names = [
131+ 'test_grep',
132+ ]
133+
134+ suite.addTest(loader.loadTestsFromModuleNames(
135+ ["%s.%s" % (__name__, tmn) for tmn in testmod_names]))
136+ return suite
137+
138
139=== added file 'grep/grep.py'
140--- grep/grep.py 1970-01-01 00:00:00 +0000
141+++ grep/grep.py 2010-02-24 16:34:15 +0000
142@@ -0,0 +1,61 @@
143+# Copyright (C) 2010 Canonical Ltd
144+# Copyright (C) 2010 Parth Malwankar <parth.malwankar@gmail.com>
145+#
146+# This program is free software; you can redistribute it and/or modify
147+# it under the terms of the GNU General Public License as published by
148+# the Free Software Foundation; either version 2 of the License, or
149+# (at your option) any later version.
150+#
151+# This program is distributed in the hope that it will be useful,
152+# but WITHOUT ANY WARRANTY; without even the implied warranty of
153+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
154+# GNU General Public License for more details.
155+#
156+# You should have received a copy of the GNU General Public License
157+# along with this program; if not, write to the Free Software
158+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
159+"""bzr grep"""
160+
161+
162+from bzrlib.lazy_import import lazy_import
163+lazy_import(globals(), """
164+import os
165+import re
166+
167+from bzrlib import (
168+ osutils,
169+ errors,
170+ lazy_regex,
171+ )
172+""")
173+
174+def compile_pattern(pattern, flags=0):
175+ patternc = None
176+ try:
177+ # use python's re.compile as we need to catch re.error in case of bad pattern
178+ lazy_regex.reset_compile()
179+ patternc = re.compile(pattern, flags)
180+ except re.error, e:
181+ raise errors.BzrError("Invalid pattern: '%s'" % pattern)
182+ return patternc
183+
184+
185+def file_grep(tree, id, relpath, path, patternc, eol_marker, outf, line_number=True):
186+ if relpath:
187+ path = osutils.normpath(osutils.pathjoin(relpath, path))
188+ path = path.replace('\\', '/')
189+ path = path.replace(relpath + '/', '', 1)
190+ fmt_with_n = path + ":%d:%s" + eol_marker
191+ fmt_without_n = path + ":%s" + eol_marker
192+
193+ index = 1
194+ for line in tree.get_file_lines(id):
195+ res = patternc.search(line)
196+ if res:
197+ if line_number:
198+ outf.write(fmt_with_n % (index, line.strip()))
199+ else:
200+ outf.write(fmt_without_n % (line.strip(),))
201+ index += 1
202+
203+
204
205=== added file 'grep/test_grep.py'
206--- grep/test_grep.py 1970-01-01 00:00:00 +0000
207+++ grep/test_grep.py 2010-02-24 16:34:15 +0000
208@@ -0,0 +1,210 @@
209+# Copyright (C) 2010 Canonical Ltd
210+# Copyright (C) 2010 Parth Malwankar <parth.malwankar@gmail.com>
211+#
212+# This program is free software; you can redistribute it and/or modify
213+# it under the terms of the GNU General Public License as published by
214+# the Free Software Foundation; either version 2 of the License, or
215+# (at your option) any later version.
216+#
217+# This program is distributed in the hope that it will be useful,
218+# but WITHOUT ANY WARRANTY; without even the implied warranty of
219+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
220+# GNU General Public License for more details.
221+#
222+# You should have received a copy of the GNU General Public License
223+# along with this program; if not, write to the Free Software
224+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
225+
226+import os
227+import re
228+
229+from bzrlib import tests
230+
231+open('/home/parthm/tmp/re.txt', 'w')
232+
233+class TestGrep(tests.TestCaseWithTransport):
234+ def _str_contains(self, base, pattern, flags=re.MULTILINE|re.DOTALL):
235+ res = re.findall(pattern, base, flags)
236+ return res != []
237+
238+ def _mk_file(self, path, line_prefix, total_lines, versioned):
239+ text=''
240+ for i in range(total_lines):
241+ text += line_prefix + str(i+1) + "\n"
242+
243+ open(path, 'w').write(text)
244+ if versioned:
245+ self.run_bzr(['add', path])
246+ self.run_bzr(['ci', '-m', '"' + path + '"'])
247+
248+ def _mk_unversioned_file(self, path, line_prefix='line', total_lines=10):
249+ self._mk_file(path, line_prefix, total_lines, versioned=False)
250+
251+ def _mk_versioned_file(self, path, line_prefix='line', total_lines=10):
252+ self._mk_file(path, line_prefix, total_lines, versioned=True)
253+
254+ def _mk_dir(self, path, versioned):
255+ os.mkdir(path)
256+ if versioned:
257+ self.run_bzr(['add', path])
258+ self.run_bzr(['ci', '-m', '"' + path + '"'])
259+
260+ def _mk_unversioned_dir(self, path):
261+ self._mk_dir(path, versioned=False)
262+
263+ def _mk_versioned_dir(self, path):
264+ self._mk_dir(path, versioned=True)
265+
266+ def test_basic_unversioned_file(self):
267+ """search for pattern in specfic file. should issue warning."""
268+ wd = 'foobar0'
269+ self.make_branch_and_tree(wd)
270+ os.chdir(wd)
271+ self._mk_unversioned_file('file0.txt')
272+ out, err = self.run_bzr(['grep', 'line1', 'file0.txt'])
273+ self.assertFalse(out, self._str_contains(out, "file0.txt:line1"))
274+ self.assertTrue(err, self._str_contains(err, "warning:.*file0.txt.*not versioned\."))
275+
276+ def test_basic_versioned_file(self):
277+ """search for pattern in specfic file"""
278+ wd = 'foobar0'
279+ self.make_branch_and_tree(wd)
280+ os.chdir(wd)
281+ self._mk_versioned_file('file0.txt')
282+ out, err = self.run_bzr(['grep', 'line1', 'file0.txt'])
283+ self.assertTrue(out, self._str_contains(out, "file0.txt:line1"))
284+ self.assertFalse(err, self._str_contains(err, "warning:.*file0.txt.*not versioned\."))
285+
286+ def test_multiple_files(self):
287+ """search for pattern in multiple files"""
288+ wd = 'foobar0'
289+ self.make_branch_and_tree(wd)
290+ os.chdir(wd)
291+ self._mk_versioned_file('file0.txt', total_lines=2)
292+ self._mk_versioned_file('file1.txt', total_lines=2)
293+ self._mk_versioned_file('file2.txt', total_lines=2)
294+ out, err = self.run_bzr(['grep', 'line[1-2]'])
295+
296+ self.assertTrue(out, self._str_contains(out, "file0.txt:line1"))
297+ self.assertTrue(out, self._str_contains(out, "file0.txt:line2"))
298+ self.assertTrue(out, self._str_contains(out, "file1.txt:line1"))
299+ self.assertTrue(out, self._str_contains(out, "file1.txt:line2"))
300+ self.assertTrue(out, self._str_contains(out, "file2.txt:line1"))
301+ self.assertTrue(out, self._str_contains(out, "file2.txt:line2"))
302+
303+ def test_null_option(self):
304+ """--null option should use NUL instead of newline"""
305+ wd = 'foobar0'
306+ self.make_branch_and_tree(wd)
307+ os.chdir(wd)
308+ self._mk_versioned_file('file0.txt', total_lines=3)
309+
310+ out, err = self.run_bzr(['grep', '--null', 'line[1-3]'])
311+ self.assertTrue(out == "file0.txt:line1\0file0.txt:line2\0file0.txt:line3\0")
312+
313+ out, err = self.run_bzr(['grep', '-Z', 'line[1-3]'])
314+ self.assertTrue(out == "file0.txt:line1\0file0.txt:line2\0file0.txt:line3\0")
315+
316+ def test_versioned_file_in_dir_no_recurse(self):
317+ """should not recurse without -R"""
318+ wd = 'foobar0'
319+ self.make_branch_and_tree(wd)
320+ os.chdir(wd)
321+ self._mk_versioned_dir('dir0')
322+ self._mk_versioned_file('dir0/file0.txt')
323+ out, err = self.run_bzr(['grep', 'line1'])
324+ self.assertFalse(out, self._str_contains(out, "file0.txt:line1"))
325+
326+ def test_versioned_file_in_dir_recurse(self):
327+ """should find pattern in hierarchy with -R"""
328+ wd = 'foobar0'
329+ self.make_branch_and_tree(wd)
330+ os.chdir(wd)
331+ self._mk_versioned_dir('dir0')
332+ self._mk_versioned_file('dir0/file0.txt')
333+ out, err = self.run_bzr(['grep', '-R', 'line1'])
334+ self.assertTrue(out, self._str_contains(out, "^dir0/file0.txt:line1"))
335+ out, err = self.run_bzr(['grep', '--recursive', 'line1'])
336+ self.assertTrue(out, self._str_contains(out, "^dir0/file0.txt:line1"))
337+
338+ def test_versioned_file_within_dir(self):
339+ """search for pattern while in nested dir"""
340+ wd = 'foobar0'
341+ self.make_branch_and_tree(wd)
342+ os.chdir(wd)
343+ self._mk_versioned_dir('dir0')
344+ self._mk_versioned_file('dir0/file0.txt')
345+ os.chdir('dir0')
346+ out, err = self.run_bzr(['grep', 'line1'])
347+ self.assertTrue(out, self._str_contains(out, "^file0.txt:line1"))
348+
349+ def test_versioned_file_within_dir_two_levels(self):
350+ """search for pattern while in nested dir (two levels)"""
351+ wd = 'foobar0'
352+ self.make_branch_and_tree(wd)
353+ os.chdir(wd)
354+ self._mk_versioned_dir('dir0')
355+ self._mk_versioned_dir('dir0/dir1')
356+ self._mk_versioned_file('dir0/dir1/file0.txt')
357+ os.chdir('dir0')
358+ out, err = self.run_bzr(['grep', '-R', 'line1'])
359+ self.assertTrue(out, self._str_contains(out, "^dir1/file0.txt:line1"))
360+ out, err = self.run_bzr(['grep', '--from-root', 'line1'])
361+ self.assertTrue(out, self._str_contains(out, "^dir0/dir1/file0.txt:line1"))
362+ out, err = self.run_bzr(['grep', 'line1'])
363+ self.assertFalse(out, self._str_contains(out, "file0.txt"))
364+
365+ def test_ignore_case_no_match(self):
366+ """match fails without --ignore-case"""
367+ wd = 'foobar0'
368+ self.make_branch_and_tree(wd)
369+ os.chdir(wd)
370+ self._mk_versioned_file('file0.txt')
371+ out, err = self.run_bzr(['grep', 'LinE1', 'file0.txt'])
372+ self.assertFalse(out, self._str_contains(out, "file0.txt:line1"))
373+
374+ def test_ignore_case_match(self):
375+ """match fails without --ignore-case"""
376+ wd = 'foobar0'
377+ self.make_branch_and_tree(wd)
378+ os.chdir(wd)
379+ self._mk_versioned_file('file0.txt')
380+ out, err = self.run_bzr(['grep', '-i', 'LinE1', 'file0.txt'])
381+ self.assertTrue(out, self._str_contains(out, "file0.txt:line1"))
382+ out, err = self.run_bzr(['grep', '--ignore-case', 'LinE1', 'file0.txt'])
383+ self.assertTrue(out, self._str_contains(out, "^file0.txt:line1"))
384+
385+ def test_from_root_fail(self):
386+ """match should fail without --from-root"""
387+ wd = 'foobar0'
388+ self.make_branch_and_tree(wd)
389+ os.chdir(wd)
390+ self._mk_versioned_file('file0.txt')
391+ self._mk_versioned_dir('dir0')
392+ os.chdir('dir0')
393+ out, err = self.run_bzr(['grep', 'line1'])
394+ self.assertFalse(out, self._str_contains(out, ".*file0.txt:line1"))
395+
396+ def test_from_root_pass(self):
397+ """match pass with --from-root"""
398+ wd = 'foobar0'
399+ self.make_branch_and_tree(wd)
400+ os.chdir(wd)
401+ self._mk_versioned_file('file0.txt')
402+ self._mk_versioned_dir('dir0')
403+ os.chdir('dir0')
404+ out, err = self.run_bzr(['grep', '--from-root', 'line1'])
405+ self.assertTrue(out, self._str_contains(out, ".*file0.txt:line1"))
406+
407+ def test_with_line_number(self):
408+ """search for pattern with --line-number"""
409+ wd = 'foobar0'
410+ self.make_branch_and_tree(wd)
411+ os.chdir(wd)
412+ self._mk_versioned_file('file0.txt')
413+ out, err = self.run_bzr(['grep', '--line-number', 'line3', 'file0.txt'])
414+ self.assertTrue(out, self._str_contains(out, "file0.txt:3:line1"))
415+ out, err = self.run_bzr(['grep', '-n', 'line1', 'file0.txt'])
416+ self.assertTrue(out, self._str_contains(out, "file0.txt:1:line1"))
417+
418+