jesusjsc

Pandas: The Differences Between iloc, loc, and ix
2019/04/10


>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame(np.arange(9*5).reshape((9, 5)),columns=list('abcde'))
>>> df1
    a   b   c   d   e
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
3  15  16  17  18  19
4  20  21  22  23  24
5  25  26  27  28  29
6  30  31  32  33  34
7  35  36  37  38  39
8  40  41  42  43  44
>>> df2 = pd.DataFrame(np.arange(9*5).reshape((9, 5)), index=['a', 'b', 'c', 'd', 'e', 1, 2, 3, 4], columns=list('abcde'))
>>> df2
    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
d  15  16  17  18  19
e  20  21  22  23  24
1  25  26  27  28  29
2  30  31  32  33  34
3  35  36  37  38  39
4  40  41  42  43  44

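First, iloc and loc

iloc selects purely by position in the index, so the values inside the brackets must be integer positions. As with ordinary Python slicing, the end of the slice is excluded.
For example, iloc[:3] returns the first three rows, whatever their labels are: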

>>> df1.iloc[:3]
    a   b   c   d   e
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
>>> df2.iloc[:3]
    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
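The following is a minimal sketch of the positional behavior described above, rebuilding the same df1 and df2. The variable names `first_three_df1`, `first_three_df2`, and `block` are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

# Same frames as in the post: df1 has the default integer index,
# df2 has an index that mixes string and integer labels.
values = np.arange(9 * 5).reshape((9, 5))
df1 = pd.DataFrame(values, columns=list("abcde"))
df2 = pd.DataFrame(values, index=["a", "b", "c", "d", "e", 1, 2, 3, 4],
                   columns=list("abcde"))

# iloc ignores the labels, so both frames yield the same values.
first_three_df1 = df1.iloc[:3]
first_three_df2 = df2.iloc[:3]
print(first_three_df2.index.tolist())  # labels of the rows that were picked

# Rows and columns can both be addressed by position; slice ends are exclusive.
block = df2.iloc[1:3, 0:2]
print(block)
```

Because only positions matter, the same call works on any frame regardless of what its index contains.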

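loc selects rows by index label, so the values inside the brackets do not have to be integers. Unlike a positional slice, a label slice includes its end label: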

>>> df1.loc[:3]  # 3 is a label here: returns the rows from the start through the row labeled 3
    a   b   c   d   e
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
3  15  16  17  18  19
>>> df2.loc[:'d']    # 'd' is a label: returns the rows from the start through the row labeled 'd'
    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
d  15  16  17  18  19
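To make the contrast with positional slicing concrete, here is a small sketch comparing the two on the same frames. The names `by_label`, `by_position`, and `sub` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

values = np.arange(9 * 5).reshape((9, 5))
df1 = pd.DataFrame(values, columns=list("abcde"))
df2 = pd.DataFrame(values, index=["a", "b", "c", "d", "e", 1, 2, 3, 4],
                   columns=list("abcde"))

# Label slice: the end label is included, so labels 0..3 give four rows.
by_label = df1.loc[:3]
# Position slice: the end position is excluded, so only three rows come back.
by_position = df1.iloc[:3]
print(len(by_label), len(by_position))

# Labels work on columns as well; 'b':'d' includes column 'd'.
sub = df2.loc["b":"d", "b":"d"]
print(sub)
```

On df1 the labels happen to equal the positions, which is exactly why `loc[:3]` and `iloc[:3]` are easy to confuse there.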

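Next, ix

ix is more complicated, and it is easy to fall into traps if you do not understand it. (ix was deprecated in pandas 0.20 and removed in pandas 1.0; the sessions below were run on an older version.)
ix is primarily label-based, like loc. When the index is not an integer index, it falls back to positional access, like iloc, for integer keys.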

>>> df1.ix[:3]  # behaves like loc: label 3 is in the integer index, so this returns the rows from the start through the row labeled 3
    a   b   c   d   e
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
3  15  16  17  18  19
>>> df2.ix[:3]  # the index is not an integer index, so 3 is treated as a position (like iloc) and three rows are returned
    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
>>> df2.ix[:'d']   # looks up the label 'd'
    a   b   c   d   e
a   0   1   2   3   4
b   5   6   7   8   9
c  10  11  12  13  14
d  15  16  17  18  19
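On pandas 1.0 or later, where ix no longer exists, each of the three calls above can be spelled out with whichever of loc or iloc matches what ix actually did. This is a sketch of that mapping, not code from the original post; the names `a`, `b`, and `c` are illustrative.

```python
import numpy as np
import pandas as pd

values = np.arange(9 * 5).reshape((9, 5))
df1 = pd.DataFrame(values, columns=list("abcde"))
df2 = pd.DataFrame(values, index=["a", "b", "c", "d", "e", 1, 2, 3, 4],
                   columns=list("abcde"))

# df1.ix[:3] acted as a label lookup, so loc is the explicit spelling.
a = df1.loc[:3]
# df2.ix[:3] acted positionally, so iloc is the explicit spelling.
b = df2.iloc[:3]
# df2.ix[:'d'] was a label lookup again.
c = df2.loc[:"d"]

print(a.index.tolist())
print(b.index.tolist())
print(c.index.tolist())
```

Choosing the accessor explicitly removes the guesswork about whether an integer means a label or a position.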

Another example, slicing rows and columns at the same time:

>>> df2.ix['b':'d', 'b':'e']
    b   c   d   e
b   6   7   8   9
c  11  12  13  14
d  16  17  18  19
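The same selection can be written without ix. The sketch below uses loc for the all-label case and, as one possible approach, chains loc and iloc when rows are given by label and columns by position. The names `by_labels` and `mixed` are illustrative.

```python
import numpy as np
import pandas as pd

values = np.arange(9 * 5).reshape((9, 5))
df2 = pd.DataFrame(values, index=["a", "b", "c", "d", "e", 1, 2, 3, 4],
                   columns=list("abcde"))

# Rows and columns both by label; both end labels are included.
by_labels = df2.loc["b":"d", "b":"e"]

# Rows by label, columns by position: select rows first, then columns.
# Columns 'b'..'e' sit at positions 1..4, so the positional slice is 1:5.
mixed = df2.loc["b":"d"].iloc[:, 1:5]

print(by_labels)
print(by_labels.equals(mixed))
```

Chaining costs an intermediate object but keeps each step unambiguous about labels versus positions.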

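loc and ix do not support a slice whose endpoints are labels of different types. In the pandas version used here, such a slice raises an error: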

>>> df2.ix['b':3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 87, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1006, in _getitem_axis
    return self._get_slice_axis(key, axis=axis)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1253, in _get_slice_axis
    indexer = self._convert_slice_indexer(slice_obj, axis)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 201, in _convert_slice_indexer
    return ax._convert_slice_indexer(key, kind=self.name)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 1238, in _convert_slice_indexer
    indexer = self.slice_indexer(start, stop, step, kind=kind)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 2997, in slice_indexer
    kind=kind)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 3182, in slice_locs
    end_slice = self.get_slice_bound(end, 'right', kind)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 3115, in get_slice_bound
    label = self._maybe_cast_slice_bound(label, side, kind)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 3073, in _maybe_cast_slice_bound
    self._invalid_indexer('slice', label)
  File "/home/jiangsichong/anaconda2/lib/python2.7/site-packages/pandas/indexes/base.py", line 1284, in _invalid_indexer
    kind=type(key)))
TypeError: cannot do slice indexing on <class 'pandas.indexes.base.Index'> with these indexers [3] of <type 'int'>
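One way to sidestep the type mix, sketched here as an assumption rather than something the original post does, is to translate each label into its position with `Index.get_loc` and then slice with iloc. The names `start`, `stop`, and `rows` are illustrative, and the approach assumes the index labels are unique.

```python
import numpy as np
import pandas as pd

values = np.arange(9 * 5).reshape((9, 5))
df2 = pd.DataFrame(values, index=["a", "b", "c", "d", "e", 1, 2, 3, 4],
                   columns=list("abcde"))

# Look up the position of each endpoint label (labels must be unique).
start = df2.index.get_loc("b")
stop = df2.index.get_loc(3)

# iloc excludes the end position, so add 1 to include the row labeled 3,
# mirroring the inclusive behavior of a label slice.
rows = df2.iloc[start:stop + 1]
print(rows)
```

This keeps the inclusive-end behavior of a label slice while doing the actual slicing by position.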

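Summary

To sum up: iloc depends only on position (the values in the brackets must be integers) and works like ordinary slicing, with the end excluded. loc depends on the labels in the index and includes the end label. ix is label-based like loc first, and falls back to positional access like iloc when the index is not an integer index.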

Last modification: May 27th, 2019 at 05:39 pm