{"id":883,"date":"2014-07-27T01:07:48","date_gmt":"2014-07-27T01:07:48","guid":{"rendered":"http:\/\/tech.avant.net\/q\/?p=883"},"modified":"2014-08-09T17:12:37","modified_gmt":"2014-08-09T17:12:37","slug":"datsize-simple-command-line-row-and-column-count","status":"publish","type":"post","link":"https:\/\/tech.avant.net\/q\/datsize-simple-command-line-row-and-column-count\/","title":{"rendered":"datsize, simple command line row and column count"},"content":{"rendered":"<p>Lately I&#8217;ve been working with lots of data files with fixed rows and columns, and I keep finding myself doing the following:<\/p>\n<p>Getting the row count of a file,<\/p>\n<pre class=\"sh_sh\">\r\ntwarnock@laptop:\/var\/data\/ctm :) wc -l lda_out\/final.gamma\r\n    3183 lda_out\/final.gamma\r\ntwarnock@laptop:\/var\/data\/ctm :) wc -l lda_out\/final.beta\r\n     200 lda_out\/final.beta\r\n<\/pre>\n<p>And getting the column count of the same files,<\/p>\n<pre class=\"sh_sh\">\r\ntwarnock@laptop:\/var\/data\/ctm :) head -1 lda_out\/final.gamma | awk '{ print NF }'\r\n200\r\ntwarnock@laptop:\/var\/data\/ctm :) head -1 lda_out\/final.beta | awk '{ print NF }'\r\n5568\r\n<\/pre>\n<p>I would do this for dozens of files and eventually decided to put this together in a simple shell function,<\/p>\n<pre class=\"sh_sh\">\r\nfunction datsize {\r\n    # quote \"$1\" so paths containing spaces are handled correctly\r\n    if [ -e \"$1\" ]; then\r\n        rows=$(wc -l < \"$1\")\r\n        cols=$(head -1 \"$1\" | awk '{ print NF }')\r\n        echo \"$rows X $cols $1\"\r\n    else\r\n        return 1\r\n    fi\r\n}\r\n<\/pre>\n<p>Simple, and so much nicer,<\/p>\n<pre class=\"sh_sh\">\r\ntwarnock@laptop:\/var\/data\/ctm :) datsize lda_out\/final.gamma\r\n    3183 X 200 lda_out\/final.gamma\r\ntwarnock@laptop:\/var\/data\/ctm :) datsize lda_out\/final.beta\r\n     200 X 5568 lda_out\/final.beta\r\ntwarnock@laptop:\/var\/data\/ctm :) datsize ctr_out\/final-theta.dat\r\n    3183 X 200 ctr_out\/final-theta.dat\r\ntwarnock@laptop:\/var\/data\/ctm :) datsize 
ctr_out\/final-U.dat\r\n    2011 X 200 ctr_out\/final-U.dat\r\ntwarnock@laptop:\/var\/data\/ctm :) datsize ctr_out\/final-V.dat\r\n    3183 X 200 ctr_out\/final-V.dat\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Lately I&#8217;ve been working with lots of data files with fixed rows and columns, and have been finding myself doing the following a lot: Getting the row count of a file, twarnock@laptop:\/var\/data\/ctm :) wc -l lda_out\/final.gamma 3183 lda_out\/final.gamma twarnock@laptop:\/var\/data\/ctm :) wc -l lda_out\/final.beta 200 lda_out\/final.beta And getting the column count of the same files, twarnock@laptop:\/var\/data\/ctm [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,14],"tags":[],"_links":{"self":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/883"}],"collection":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/comments?post=883"}],"version-history":[{"count":5,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/883\/revisions"}],"predecessor-version":[{"id":893,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/883\/revisions\/893"}],"wp:attachment":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/media?parent=883"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/categories?post=883"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/tags?post=883"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}