{"id":625,"date":"2012-04-24T19:27:19","date_gmt":"2012-04-24T19:27:19","guid":{"rendered":"http:\/\/tech.avant.net\/q\/?p=625"},"modified":"2019-04-30T16:20:28","modified_gmt":"2019-04-30T16:20:28","slug":"python-slice-and-sql-every-nth-row","status":"publish","type":"post","link":"https:\/\/tech.avant.net\/q\/python-slice-and-sql-every-nth-row\/","title":{"rendered":"python slice and sql every Nth row"},"content":{"rendered":"<p>I would like to retrieve every Nth row of a SQL table, and I would like this accessed via a python slice function.<\/p>\n<p>A python slice allows access to a list (or any object that implements a __getitem__ method) by a start, stop, and step &#8212; for example,<\/p>\n<pre class=\"sh_python\">&gt;&gt;&gt; foo = range(100)\n&gt;&gt;&gt; foo[5]\n5\n&gt;&gt;&gt; foo[5:10]\n[5, 6, 7, 8, 9]\n&gt;&gt;&gt; foo[5:10:2]\n[5, 7, 9]\n&gt;&gt;&gt; \n<\/pre>\n<p>The parameters of a slice behave identically to the builtin range() function,<\/p>\n<pre class=\"sh_python\">&gt;&gt;&gt; range(5,10,2)\n[5, 7, 9]\n&gt;&gt;&gt; \n<\/pre>\n<p>Given a python class that accesses a database table, you may want to implement slice functionality directly into that class. This can be done by implementing a __getitem__ method, a very simple __getitem__ looks something like this:<\/p>\n<pre class=\"sh_python\">def __getitem__(self, key):\n    curs = self.db.cursor()\n    return list(curs.execute('SELECT * FROM %s WHERE ROWID = %s' %(self.table, key+1)).fetchall()[0])\n<\/pre>\n<p>However, this fails to account for slices and will only work for single item retrievals. When only a single key is provided python will pass that key to __getitem__, and when slice parameters are provided python will pass a slice object. In other words, inspect the key to figure out if a slice is passed in, e.g.,<\/p>\n<pre class=\"sh_python\">def __getitem__(self, key):\n    if isinstance( key, slice ):\n        ... slice acccess, e.g., foo[1:5], foo[:5], foo[0:10:2]\n    else:\n        ... single element access, e.g., foo[5]\n<\/pre>\n<p>A slice object will have three parameters: slice.start, slice.stop, and slice.step; corresponding to object[start:stop:step]<\/p>\n<p>Going back to the SQL case, we&#8217;ll need to dynamically construct a SQL statement that will access every Nth row between a start and stop.<\/p>\n<p>The simplest way to do this is using a modulo operator (available in all major relational databases), for example, object[5:10:2] in sqlite<\/p>\n<pre class=\"sh_sql\">SELECT * \nFROM table\nWHERE ROWID &gt; 5 AND ROWID &lt;= 10\n  AND ROWID % 2 = 5%2\n<\/pre>\n<p>This approach allow you to retrieve every Nth row between an arbitrary start and stop, to put this into practice in python, a __getitem__ may be implemented as follows:<\/p>\n<pre class=\"sh_python\">def __getitem__(self, key):\n    curs = self.db.cursor()\n    if isinstance( key, slice ):\n        start = 0 if key.start is None else key.start\n        stop = len(self) - key.start if key.stop is None else key.stop\n        qstep = ''\n        if key.step is not None:\n           qstep = ' AND ROWID %% %s = %s ' %(key.step, (start+1) % key.step)\n        res = curs.execute('SELECT * FROM %s WHERE ROWID &gt; %s AND ROWID &lt;= %s %s'\n            %(self.table, start, stop, qstep)).fetchall()\n    else:\n        res = list(curs.execute('SELECT * FROM %s WHERE ROWID = %s' %(self.table, key+1)).fetchall()[0])\n    curs.close()\n    return res\n<\/pre>\n<p>The above code is part of a sqlite python wrapper used for parsing large csv files, available here <a href=\"https:\/\/github.com\/timwarnock\/CSVReports\">github.com\/timwarnock\/CSVReports<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>I would like to retrieve every Nth row of a SQL table, and I would like this accessed via a python slice function. A python slice allows access to a list (or any object that implements a __getitem__ method) by a start, stop, and step &#8212; for example, &gt;&gt;&gt; foo = range(100) &gt;&gt;&gt; foo[5] 5 [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[16,15,18,6],"tags":[],"_links":{"self":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/625"}],"collection":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/comments?post=625"}],"version-history":[{"count":7,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/625\/revisions"}],"predecessor-version":[{"id":955,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/posts\/625\/revisions\/955"}],"wp:attachment":[{"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/media?parent=625"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/categories?post=625"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tech.avant.net\/q\/wp-json\/wp\/v2\/tags?post=625"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}