Monday, April 18, 2016

Comparing Directories in Python

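Lately I've been writing small tools to cut down the time I spend merging code. I'm not a very careful person and I keep missing things, and every time I miss something I feel even more strongly that I should have built this sooner. (clenches fist)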

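Merging code always involves three directories: A as delivered by the vendor, A+ with my own changes, and the vendor's new version B.
This is where Python's dircmp comes in handy. At the very bottom of the page here there is this short example, which walks into the subdirectories and prints each differing file together with its paths.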

>>> from filecmp import dircmp
>>> def print_diff_files(dcmp):
...     for name in dcmp.diff_files:
...         print "diff_file %s found in %s and %s" % (name, dcmp.left,
...               dcmp.right)
...     for sub_dcmp in dcmp.subdirs.values():
...         print_diff_files(sub_dcmp)
...
>>> dcmp = dircmp('dir1', 'dir2') 
>>> print_diff_files(dcmp) 

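People like me who aren't very bright hate seeing recursion more than anything. It took me quite a while of searching to figure out how to pass the found files and paths back out, because passing values drags in the whole call by value versus call by reference business.
Googling "call by reference python" brings up this blogger's post as the second result!
So I dug into it. The post covers immutable and mutable objects, but honestly I couldn't quite follow its examples, so I wrote my own code to see what happens.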

 def LOL(mutable, immutable):  
   print  
   print "LOL start"  
   print "mutable id in LOL is %s"%(id(mutable))  
   print "immutable id in LOL is %s"%(id(immutable))  
   
   mutable.append('xyz')  
   print "mutable modifed in LOL is ",  
   print mutable  
   print "mutable id in LOL is %s"%(id(mutable))  
   
   immutable += 1  
   print "immutable plus 1 in LOL is %s"%(immutable)  
   print "immutable id in LOL is %s"%(id(immutable))  
   
   mutable = ['MMM']  
   print "mutable assigned in LOL is ",  
   print mutable  
   print "mutable id in LOL is %s"%(id(mutable))  
   
   imumutable = 5  # note the misspelled name: this binds a new local and leaves immutable at 1
   print "immutable assigned by 5 in LOL is %s"%(immutable)  
   print "immutable id in LOL is %s"%(id(immutable))  
   print "LOL end"  
   print  
   
 mutable = ['abc']  
 immutable = 0  
 print "mutable id is %s"%(id(mutable))  
 print "immutable id is %s"%(id(immutable))  
   
 LOL(mutable, immutable)  
   
 print "mutable id is %s"%(id(mutable))  
 print "immutable id is %s"%(id(immutable))  
 print "mutable is ",  
 print mutable  
 print "immutable is %s"%(immutable)


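Basically, numbers, booleans, strings, tuples, and frozensets are immutable objects. The other common built-in containers (lists, dicts, sets) are mutable, and instances of classes you define yourself are mutable by default. (See here.)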
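
A quick way to check the classification above for yourself is to try an in-place change and see whether Python refuses. The following is only a minimal sketch in Python 3 syntax; `Box` and `try_item_assignment` are hypothetical names made up for this illustration.

```python
# Sketch (Python 3 syntax): probe whether a value can be changed in place.
class Box(object):
    """Hypothetical user-defined class; its instances are mutable by default."""
    def __init__(self, value):
        self.value = value

def try_item_assignment(obj):
    """Return True if obj[0] = ... succeeds, False if the type forbids it."""
    try:
        obj[0] = obj[0]
        return True
    except TypeError:
        return False

list_ok = try_item_assignment(['abc'])    # lists support in-place changes
tuple_ok = try_item_assignment(('abc',))  # tuples do not
str_ok = try_item_assignment('abc')       # strings do not

box = Box(1)
box_id_before = id(box)
box.value = 2                             # attribute changed in place
box_same_object = (id(box) == box_id_before)

print(list_ok, tuple_ok, str_ok, box_same_object)
```

The list and the `Box` instance accept the change and keep their identity, while the tuple and the string raise `TypeError`.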

Here is the output:
 mutable id is 139901048634112  
 immutable id is 21690320  
   
 LOL start  
 mutable id in LOL is 139901048634112  
 immutable id in LOL is 21690320  
 mutable modifed in LOL is ['abc', 'xyz']
 mutable id in LOL is 139901048634112    -> a mutable object can be modified in place inside the function; it is still the same object
 immutable plus 1 in LOL is 1  
 immutable id in LOL is 21690296         -> operating on an immutable object inside the function yields a different object (the id changes)
 mutable assigned in LOL is ['MMM']
 mutable id in LOL is 139901048690160    -> assigning to the mutable parameter inside the function makes it a new object
 immutable assigned by 5 in LOL is 1  
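 immutable id in LOL is 21690296         -> unchanged only because the code assigns to the misspelled name imumutable; assigning to immutable itself is allowed and would just rebind the local name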
 LOL end  
   
 mutable id is 139901048634112  
 immutable id is 21690320  
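 mutable is ['abc', 'xyz']               -> in-place changes made to a mutable object inside the function directly affect the caller's object
 immutable is 0                          -> operations on an immutable object inside the function do not affect the caller's object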
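
To double-check the assignment case without the misspelled `imumutable` name from the listing above, here is a minimal sketch in Python 3 syntax. The function `rebind_and_mutate` and its variable names are made up for this illustration; the point is only that assigning to a parameter is always allowed, for mutable and immutable objects alike, and that it never reaches the caller.

```python
def rebind_and_mutate(items, number):
    items.append('xyz')   # in-place change: visible to the caller
    number = 5            # rebinding a local name: invisible to the caller
    items = ['MMM']       # also a rebinding: the caller's list is untouched
    return items, number

outer_items = ['abc']
outer_number = 0
inner_items, inner_number = rebind_and_mutate(outer_items, outer_number)

print(outer_items)    # ['abc', 'xyz']
print(outer_number)   # 0
print(inner_items)    # ['MMM']
print(inner_number)   # 5
```

So what decides whether the caller sees a change is not the type of the argument but whether the function modifies the object in place or merely points its local name at something else.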


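Applying the result above (a mutable object passed into a function can be modified in place and is still the same object) to the recursive directory comparison from the beginning, we can pass in a list to collect each found file together with its left and right paths, and get exactly the result we want.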
 from filecmp import dircmp  
   
 def print_diff_files(dcmp, result):  
   for name in dcmp.diff_files:  
     print "diff_file %s found in %s and %s" % (name, dcmp.left, dcmp.right)  
     temp = [name, dcmp.left, dcmp.right]  
     result.append(temp)  
   for sub_dcmp in dcmp.subdirs.values():  
     print_diff_files(sub_dcmp, result)  
   
 result = []  
 dcmp = dircmp('dir1', 'dir2')   
 print_diff_files(dcmp, result)   
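
If you would rather not rely on a shared list, another option is to have each recursive call return what it found and let the caller merge the pieces. The sketch below is one possible variant in Python 3 syntax, not the approach used above. `collect_diff_files` and `write_file` are hypothetical helper names, and the temporary directory layout is invented purely so the example can run on its own. The two versions of `changed.txt` deliberately differ in size so that dircmp's default shallow comparison reports them as different.

```python
import os
import tempfile
from filecmp import dircmp

def collect_diff_files(dcmp):
    """Return [name, left_dir, right_dir] for every differing file, recursively."""
    found = [[name, dcmp.left, dcmp.right] for name in dcmp.diff_files]
    for sub_dcmp in dcmp.subdirs.values():
        found.extend(collect_diff_files(sub_dcmp))
    return found

def write_file(path, text):
    with open(path, 'w') as handle:
        handle.write(text)

# Throwaway directories standing in for dir1 and dir2 (hypothetical layout).
root = tempfile.mkdtemp()
dir1 = os.path.join(root, 'dir1')
dir2 = os.path.join(root, 'dir2')
for base in (dir1, dir2):
    os.makedirs(os.path.join(base, 'sub'))
write_file(os.path.join(dir1, 'same.txt'), 'same')
write_file(os.path.join(dir2, 'same.txt'), 'same')
write_file(os.path.join(dir1, 'sub', 'changed.txt'), 'old')
write_file(os.path.join(dir2, 'sub', 'changed.txt'), 'new version')

result = collect_diff_files(dircmp(dir1, dir2))
for name, left, right in result:
    print("diff_file %s found in %s and %s" % (name, left, right))
```

Both styles give the same list. The return-value version just keeps the data flow visible in the function signature, at the cost of building small intermediate lists.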

Simple, isn't it? (Yeah, right.)
