Installing scikit-learn, pandas, scipy, numpy on a Mac with M1

Before installing scikit-learn, you should install its dependencies.

Installing Pandas

Pandas installs fine with pip install pandas. However, when I tried to import it in my Jupyter Lab notebook, it crashed with this error:

ValueError                                Traceback (most recent call last)
<ipython-input-2-b6c7f9fc9652> in <module>
----> 1 import pandas as pd

~/miniforge3/envs/tf25/lib/python3.9/site-packages/pandas/__init__.py in <module>
     27 
     28 try:
---> 29     from pandas._libs import hashtable as _hashtable, lib as _lib, tslib as _tslib
     30 except ImportError as e:  # pragma: no cover
     31     # hack but overkill to use re

~/miniforge3/envs/tf25/lib/python3.9/site-packages/pandas/_libs/__init__.py in <module>
     11 
     12 
---> 13 from pandas._libs.interval import Interval
     14 from pandas._libs.tslibs import (
     15     NaT,

pandas/_libs/interval.pyx in init pandas._libs.interval()

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

This StackOverflow post recommends upgrading numpy to 1.20+, but since I am using TensorFlow, I am stuck with 1.19.5. The post mentions a GitHub ticket that expands on the solutions. One simple solution that could work is to run pip install with some additional flags (--no-cache-dir --no-binary :all:), which supposedly compile the package you are installing against the local version of numpy.
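To see which side of that 1.20 line an environment sits on without importing numpy at all, something like the following sketch could be used. The helper names release_tuple and numpy_is_older_than are made up for this post, and the default threshold of (1, 20) simply mirrors the version mentioned above; it is not an official compatibility rule.

```python
from importlib.metadata import PackageNotFoundError, version


def release_tuple(text):
    """Turn '1.19.5' into (1, 19, 5), dropping suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def numpy_is_older_than(threshold=(1, 20)):
    """Return True/False, or None when numpy is not installed."""
    try:
        installed = version("numpy")  # reads package metadata, no import
    except PackageNotFoundError:
        return None
    return release_tuple(installed) < threshold


print("numpy older than 1.20:", numpy_is_older_than())
```

Reading the metadata instead of importing the package keeps the check usable even when the import itself is what crashes.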

Another person suggests using older packages. I had installed Pandas 1.2.4; they were using Pandas 1.1.2 (incidentally with numpy 1.20). I looked for a compatibility table that would tell me which versions of Pandas support numpy<1.20. I looked at https://pypi.org/project/pandas/, https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html and other pages, to no avail.

Eventually I just tried installing that version with pip, and it failed:

% pip install pandas==1.1.2
Collecting pandas==1.1.2
  Downloading pandas-1.1.2.tar.gz (5.2 MB)
     |████████████████████████████████| 5.2 MB 2.1 MB/s 
  Installing build dependencies ... error
  ERROR: Command errored out with exit status 1:
   command: /Users/anhtuan/miniforge3/envs/tf25/bin/python3.9 /private/var/folders/ym/2b7pw1yn0v71ybqwb07t4vw00000gn/T/pip-standalone-pip-7my5a2_g/__env_pip__.zip/pip install --ignore-installed --no-user --prefix /private/var/folders/ym/2b7pw1yn0v71ybqwb07t4vw00000gn/T/pip-build-env-8qv_ct5z/overlay --no-warn-script-location --no-binary :none: --only-binary :none: -i https://pypi.org/simple -- setuptools wheel 'Cython>=0.29.16,<3' 'numpy==1.15.4; python_version=='"'"'3.6'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.15.4; python_version=='"'"'3.7'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.17.3; python_version>='"'"'3.8'"'"' and platform_system!='"'"'AIX'"'"'' 'numpy==1.16.0; python_version=='"'"'3.6'"'"' and platform_system=='"'"'AIX'"'"'' 'numpy==1.16.0; python_version=='"'"'3.7'"'"' and platform_system=='"'"'AIX'"'"'' 'numpy==1.17.3; python_version>='"'"'3.8'"'"' and platform_system=='"'"'AIX'"'"''
...
many more angry red lines

So I tried with conda install:

% conda install pandas==1.1.2
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.

PackagesNotFoundError: The following packages are not available from current channels:

  - pandas==1.1.2

Current channels:

  - https://conda.anaconda.org/conda-forge/osx-arm64
  - https://conda.anaconda.org/conda-forge/noarch

But it wasn't available. On https://conda.anaconda.org/conda-forge/osx-arm64/ I saw that pandas was available from version 1.1.3 onwards, so I installed that one:

conda install pandas==1.1.3

And this time it worked. import pandas ran fine and I was able to use the library to parse a CSV file.
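A quick smoke test along these lines can confirm both steps at once. It is only a sketch: the CSV content is made up, pandas_smoke_test is a hypothetical helper name, and the import is wrapped so that the ValueError shown earlier is reported instead of crashing the notebook cell.

```python
import io

# Made-up sample data, only used to exercise the CSV parser.
SAMPLE_CSV = "name,score\nalice,1\nbob,2\n"


def pandas_smoke_test():
    """Import pandas and parse a tiny in-memory CSV, reporting what happened."""
    try:
        import pandas as pd
    except (ImportError, ValueError) as exc:
        # The binary incompatibility described above surfaces as a ValueError.
        return f"import failed: {exc}"
    frame = pd.read_csv(io.StringIO(SAMPLE_CSV))
    return f"pandas {pd.__version__} parsed {len(frame)} rows"


print(pandas_smoke_test())
```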

Things I could try further:

  • pip install the latest version with the custom flags.
  • conda install pandas with a more recent version.
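For the first option, the pip invocation could be assembled from Python so that it targets the interpreter of the active environment. This is a dry-run sketch that only prints the command; source_build_command is a hypothetical helper, and I have not verified that a source build actually succeeds on an M1.

```python
import shlex
import sys


def source_build_command(package, python=sys.executable):
    """Build a pip command that skips the cache and prebuilt wheels."""
    return [
        python, "-m", "pip", "install",
        "--no-cache-dir",        # do not reuse a previously built wheel
        "--no-binary", ":all:",  # compile from source instead of using wheels
        package,
    ]


cmd = source_build_command("pandas")
print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute it
```

Calling pip through sys.executable avoids accidentally installing into a different environment than the one the notebook kernel uses.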

Installing scipy and numpy

If you try to install these with pip, the install fails with thousands of lines of red logs. On a Mac with M1, I had to use Conda instead.

% conda install scikit-learn
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/anhtuan/miniforge3/envs/tf25

  added / updated specs:
    - scikit-learn


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    scikit-learn-0.24.2        |   py39hab69601_0         6.6 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         6.6 MB

The following NEW packages will be INSTALLED:

  joblib             conda-forge/noarch::joblib-1.0.1-pyhd8ed1ab_0
  scikit-learn       conda-forge/osx-arm64::scikit-learn-0.24.2-py39hab69601_0
  threadpoolctl      conda-forge/noarch::threadpoolctl-2.1.0-pyh5ca1d4c_0


Proceed ([y]/n)? y


Downloading and Extracting Packages
scikit-learn-0.24.2  | 6.6 MB    | ####################################################################################################################################################################### | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
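Once everything is installed, a short script can check that all four libraries really import in the environment, since a successful install does not guarantee a successful import (as the Pandas error above showed). This is a minimal sketch: import_report is a hypothetical helper, and the platform.machine() line is there only because a native Apple Silicon Python typically reports arm64.

```python
import importlib
import platform

# Import names, not package names: scikit-learn is imported as sklearn.
MODULES = ("numpy", "scipy", "pandas", "sklearn")


def import_report(names=MODULES):
    """Try to import each module and record its version or the failure."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # includes the ValueError seen with Pandas
            report[name] = f"FAILED ({type(exc).__name__}: {exc})"
        else:
            report[name] = getattr(module, "__version__", "unknown version")
    return report


print("machine:", platform.machine())
for name, status in import_report().items():
    print(f"{name:8} {status}")
```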