An Introduction to Common Python Machine Learning and Deep Learning Libraries (Summary)

This article summarizes commonly used Python libraries for machine learning and deep learning, with plenty of examples along the way. We hope you find it useful.


Preface

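With artificial intelligence (AI) booming, many industries have turned their attention to it, and wave after wave of people have started learning it. The principles behind AI cannot be covered in detail in one short article, but as in any discipline, we do not need to "reinvent the wheel" from scratch: the rich ecosystem of AI frameworks lets us build models quickly and get started.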

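Artificial intelligence refers to a family of techniques that let machines process information the way humans do. Machine learning is the practice of programming computers to learn from historical data and make predictions on new data. A neural network is a machine learning model inspired by the structure and characteristics of the biological brain. Deep learning is a subset of machine learning that handles large amounts of unstructured data such as human speech, text, and images. These concepts are therefore nested: artificial intelligence is the broadest term and deep learning is the most specific.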

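To give readers a first look at the Python libraries commonly used for AI, so that they can pick the ones that fit their needs, this article gives a brief but broad tour of the most common ones.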


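1. NumPy

NumPy (Numerical Python) is an extension library for Python that supports large multidimensional arrays and matrices and provides a large collection of mathematical functions that operate on them. NumPy's core is written in C, and its arrays store the values themselves in a contiguous block rather than pointers to Python objects, so array operations run much faster than equivalent pure-Python code.
The example below compares the time taken to compute the sine of every element of a list in pure Python and with NumPy:

import math
import time

import numpy as np

# Pure Python: call math.sin once per element
start = time.time()
for i in range(10):
    list_1 = list(range(1, 10000))
    for j in range(len(list_1)):
        list_1[j] = math.sin(list_1[j])
print("Pure Python took {}s".format(time.time() - start))

# NumPy: apply np.sin to the whole array at once
start = time.time()
for i in range(10):
    list_1 = np.arange(1, 10000)
    list_1 = np.sin(list_1)
print("NumPy took {}s".format(time.time() - start))

As the output below shows, the NumPy version is faster than the pure-Python one:

Pure Python took 0.017444372177124023s
NumPy took 0.001619577407836914s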
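
The speed-up comes from NumPy keeping all values in one typed buffer and running the element loop in compiled code. The following is a minimal sketch (the array size is simply the one used in the timing example) that inspects that storage and checks that the vectorized result matches a pure-Python loop:

```python
import math

import numpy as np

values = np.arange(1, 10000, dtype=np.float64)

# One fixed-size element type for the whole array, stored in a single buffer.
print(values.dtype, values.itemsize, values.nbytes)

# Vectorized call: the loop over elements runs in compiled code.
vectorized = np.sin(values)

# Equivalent pure-Python loop, one math.sin call per element.
looped = [math.sin(v) for v in values.tolist()]

print(np.allclose(vectorized, looped))
```

Both approaches compute the same numbers; only where the loop runs differs.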

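2. OpenCV

OpenCV is a cross-platform computer vision library that runs on Linux, Windows, and macOS. It is lightweight and efficient, built from C functions and C++ classes, and it also provides a Python interface. It implements many general-purpose algorithms for image processing and computer vision.
The code below tries out a few simple filters, including averaging, Gaussian blur, and bilateral filtering:

import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

img = cv.imread('h89817032p0.png')
kernel = np.ones((5, 5), np.float32) / 25
dst = cv.filter2D(img, -1, kernel)            # averaging with a 5x5 box kernel
blur_1 = cv.GaussianBlur(img, (5, 5), 0)      # Gaussian blur
blur_2 = cv.bilateralFilter(img, 9, 75, 75)   # edge-preserving bilateral filter

plt.figure(figsize=(10, 10))
# OpenCV loads images as BGR; [:, :, ::-1] reverses the channels to RGB for Matplotlib
plt.subplot(221), plt.imshow(img[:, :, ::-1]), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(dst[:, :, ::-1]), plt.title('Averaging')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(blur_1[:, :, ::-1]), plt.title('Gaussian')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(blur_2[:, :, ::-1]), plt.title('Bilateral')
plt.xticks([]), plt.yticks([])
plt.show()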


For more OpenCV image-processing operations, see the post "OpenCV Image Processing Basics (Transformations and Denoising)".
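
To make the averaging filter concrete: the kernel np.ones((5,5))/25 replaces each pixel with the mean of its 5x5 neighbourhood. The sketch below reproduces that computation in plain NumPy for interior pixels only, on a small synthetic array rather than a real image. It illustrates the idea and is not how cv.filter2D is implemented; in particular it ignores the border handling that OpenCV applies. The helper name box_filter_valid is made up for this example.

```python
import numpy as np


def box_filter_valid(image, size=5):
    """Mean of every size x size window that fits entirely inside the image."""
    kernel = np.ones((size, size), dtype=np.float64) / (size * size)
    rows = image.shape[0] - size + 1
    cols = image.shape[1] - size + 1
    out = np.empty((rows, cols), dtype=np.float64)
    for r in range(rows):
        for c in range(cols):
            window = image[r:r + size, c:c + size]
            out[r, c] = np.sum(window * kernel)
    return out


# Synthetic 8x8 grayscale "image" (hypothetical data for illustration).
image = np.arange(64, dtype=np.float64).reshape(8, 8)
smoothed = box_filter_valid(image)

print(smoothed.shape)
print(smoothed[0, 0], image[0:5, 0:5].mean())
```

Each output value is simply the mean of the window it was computed from, which is why this filter blurs fine detail.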

3. Scikit-image

scikit-image is an image processing library built on SciPy that treats images as NumPy arrays.
For example, scikit-image can change the scale of an image with functions such as rescale, resize, and downscale_local_mean.

import matplotlib.pyplot as plt
from skimage import color, io
from skimage.transform import rescale, resize, downscale_local_mean

image = color.rgb2gray(io.imread('h89817032p0.png'))

image_rescaled = rescale(image, 0.25, anti_aliasing=False)
image_resized = resize(image, (image.shape[0] // 4, image.shape[1] // 4),
                       anti_aliasing=True)
image_downscaled = downscale_local_mean(image, (4, 3))

plt.figure(figsize=(20, 20))
plt.subplot(221), plt.imshow(image, cmap='gray'), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(222), plt.imshow(image_rescaled, cmap='gray'), plt.title('Rescaled')
plt.xticks([]), plt.yticks([])
plt.subplot(223), plt.imshow(image_resized, cmap='gray'), plt.title('Resized')
plt.xticks([]), plt.yticks([])
plt.subplot(224), plt.imshow(image_downscaled, cmap='gray'), plt.title('Downscaled')
plt.xticks([]), plt.yticks([])
plt.show()


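4. Python Imaging Library (PIL)

The Python Imaging Library (PIL) became the de facto standard image processing library for Python because it is powerful while keeping a simple, easy-to-use API.
However, PIL only supports Python up to 2.7 and has gone unmaintained for years, so a group of volunteers created a compatible fork called Pillow. Pillow supports current Python 3.x releases and adds many new features, so we can skip PIL and install Pillow directly.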

5. Pillow

Generating a letter CAPTCHA image with Pillow:

from PIL import Image, ImageDraw, ImageFont, ImageFilter
import random

# Random letter:
def rndChar():
    return chr(random.randint(65, 90))

# Random color 1:
def rndColor():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))

# Random color 2:
def rndColor2():
    return (random.randint(32, 127), random.randint(32, 127), random.randint(32, 127))

# 360 x 360:
width = 60 * 6
height = 60 * 6
image = Image.new('RGB', (width, height), (255, 255, 255))
# Create a Font object (this path is system-specific; point it at a TrueType font on your machine):
font = ImageFont.truetype('/usr/share/fonts/wps-office/simhei.ttf', 60)
# Create a Draw object:
draw = ImageDraw.Draw(image)
# Fill every pixel:
for x in range(width):
    for y in range(height):
        draw.point((x, y), fill=rndColor())
# Draw the text:
for t in range(6):
    draw.text((60 * t + 10, 150), rndChar(), font=font, fill=rndColor2())
# Blur:
image = image.filter(ImageFilter.BLUR)
image.save('code.jpg', 'jpeg')

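
The font path in the CAPTCHA example is tied to one particular machine. As a hedged variant, the sketch below uses Pillow's built-in default font through ImageFont.load_default() and keeps the result in memory, so it runs without any external files. The image size, colours, letter positions, and random seed are arbitrary choices for illustration.

```python
import io
import random

from PIL import Image, ImageDraw, ImageFilter, ImageFont

random.seed(0)  # make the random letters reproducible

width, height = 240, 60
image = Image.new('RGB', (width, height), (255, 255, 255))
draw = ImageDraw.Draw(image)
font = ImageFont.load_default()  # small built-in font, no font file needed

# Draw four random uppercase letters, one per 60-pixel column.
letters = [chr(random.randint(65, 90)) for _ in range(4)]
for i, letter in enumerate(letters):
    draw.text((60 * i + 25, 25), letter, font=font, fill=(32, 32, 127))

blurred = image.filter(ImageFilter.BLUR)

# Encode to PNG in memory instead of writing a file to disk.
buffer = io.BytesIO()
blurred.save(buffer, format='PNG')
png_bytes = buffer.getvalue()

print(blurred.size, blurred.mode, len(letters))
```

The default font is small, so for a realistic CAPTCHA you would still want a TrueType font loaded with ImageFont.truetype.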

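6. SimpleCV

SimpleCV is an open source framework for building computer vision applications. It gives access to high-performance computer vision libraries such as OpenCV without first having to learn about bit depths, file formats, color spaces, buffer management, eigenvalues, or matrices. Its Python 3 support, however, is very poor. Running the following code under Python 3.7:

from SimpleCV import Image, Color, Display

# load an image from imgur
img = Image('http://i.imgur.com/lfAeZ4n.png')
# use a keypoint detector to find areas of interest
feats = img.findKeypoints()
# draw the list of keypoints
feats.draw(color=Color.RED)
# show the resulting image.
img.show()
# apply the stuff we found to the image.
output = img.applyLayers()
# save the results.
output.save('juniperfeats.png')

raises the following error, so using SimpleCV with Python 3 is not recommended: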

SyntaxError: Missing parentheses in call to 'print'. Did you mean print('unit test')?

7. Mahotas

Mahotas is a fast computer vision library built on top of NumPy. It currently offers more than 100 image processing and computer vision functions, and the number keeps growing.
Loading an image with Mahotas and operating on its pixels:

import numpy as np
import mahotas
import mahotas.demos
from matplotlib import pyplot as plt

f = mahotas.demos.load('lena', as_grey=True)
f = f[128:, 128:]
plt.gray()
# Show the data:
print("Fraction of zeros in original image: {0}".format(np.mean(f == 0)))
plt.imshow(f)
plt.show()


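8. Ilastik

Ilastik offers machine-learning-based analysis of bioimages. Using machine learning algorithms, it makes it easy to segment, classify, track, and count cells or other experimental data. Most operations are interactive and require no machine learning expertise. See https://www.ilastik.org/documentation/basics/installation.html for installation and usage.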

9. Scikit-learn

Scikit-learn is a free software machine learning library for the Python programming language. It features a range of classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN.
Implementing the KMeans algorithm with Scikit-learn:

import time

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import MiniBatchKMeans, KMeans
from sklearn.metrics.pairwise import pairwise_distances_argmin
from sklearn.datasets import make_blobs

# Generate sample data
np.random.seed(0)
batch_size = 45
centers = [[1, 1], [-1, -1], [1, -1]]
n_clusters = len(centers)
X, labels_true = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7)

# Compute clustering with KMeans
k_means = KMeans(init='k-means++', n_clusters=3, n_init=10)
t0 = time.time()
k_means.fit(X)
t_batch = time.time() - t0

# Compute clustering with MiniBatchKMeans
mbk = MiniBatchKMeans(init='k-means++', n_clusters=3, batch_size=batch_size,
                      n_init=10, max_no_improvement=10, verbose=0)
t0 = time.time()
mbk.fit(X)
t_mini_batch = time.time() - t0

# Plot result
fig = plt.figure(figsize=(8, 3))
fig.subplots_adjust(left=0.02, right=0.98, bottom=0.05, top=0.9)
colors = ['#4EACC5', '#FF9C34', '#4E9A06']

# We want to have the same colors for the same cluster from the
# MiniBatchKMeans and the KMeans algorithm. Let's pair the cluster centers per
# closest one.
k_means_cluster_centers = k_means.cluster_centers_
order = pairwise_distances_argmin(k_means.cluster_centers_,
                                  mbk.cluster_centers_)
mbk_means_cluster_centers = mbk.cluster_centers_[order]

k_means_labels = pairwise_distances_argmin(X, k_means_cluster_centers)
mbk_means_labels = pairwise_distances_argmin(X, mbk_means_cluster_centers)

# KMeans
for k, col in zip(range(n_clusters), colors):
    my_members = k_means_labels == k
    cluster_center = k_means_cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], 'w',
             markerfacecolor=col, marker='.')
    plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
             markeredgecolor='k', markersize=6)
plt.title('KMeans')
plt.xticks(())
plt.yticks(())
plt.show()

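
Once fitted, a KMeans model can also assign new points to the nearest learned centre. The sketch below is a minimal illustration of that step, using the same three blob centres as the clustering example; the query points and the random_state values are assumptions added here so the run is reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

centers = [[1, 1], [-1, -1], [1, -1]]
X, _ = make_blobs(n_samples=3000, centers=centers, cluster_std=0.7,
                  random_state=0)

k_means = KMeans(init='k-means++', n_clusters=3, n_init=10, random_state=0)
k_means.fit(X)

# Assign previously unseen points to the closest learned cluster centre.
new_points = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])
predicted = k_means.predict(new_points)

print(k_means.cluster_centers_.shape)
print(predicted)
```

The cluster indices themselves are arbitrary; what matters is that points near different blob centres receive different labels.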

10. SciPy

SciPy provides many user-friendly and efficient numerical routines, such as numerical integration, interpolation, optimization, and linear algebra.
It also defines many special functions from mathematical physics, including elliptic functions, Bessel functions, the gamma and beta functions, hypergeometric functions, and parabolic cylinder functions.
The example below uses Bessel functions to plot the shape of a vibrating drumhead:

from scipy import special
import matplotlib.pyplot as plt
import numpy as np

def drumhead_height(n, k, distance, angle, t):
    kth_zero = special.jn_zeros(n, k)[-1]
    return np.cos(t) * np.cos(n*angle) * special.jn(n, distance*kth_zero)

theta = np.r_[0:2*np.pi:50j]
radius = np.r_[0:1:50j]
x = np.array([r * np.cos(theta) for r in radius])
y = np.array([r * np.sin(theta) for r in radius])
z = np.array([drumhead_height(1, 1, r, theta, 0.5) for r in radius])

fig = plt.figure()
ax = fig.add_axes(rect=(0, 0.05, 0.95, 0.95), projection='3d')
ax.plot_surface(x, y, z, rstride=1, cstride=1, cmap='RdBu_r', vmin=-0.5, vmax=0.5)
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_xticks(np.arange(-1, 1.1, 0.5))
ax.set_yticks(np.arange(-1, 1.1, 0.5))
ax.set_zlabel('Z')
plt.show()

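
The numerical routines mentioned at the start of this section are just as short to use. The following sketch picks three problems with known answers (chosen here purely for illustration) so the results are easy to check by hand:

```python
import numpy as np
from scipy import integrate, optimize, special

# Numerical integration: the integral of sin(x) over [0, pi] is exactly 2.
area, abs_error = integrate.quad(np.sin, 0, np.pi)

# Scalar optimization: (x - 3)^2 has its minimum at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3) ** 2)

# Special functions: gamma(n) equals (n - 1)! for positive integers.
gamma_5 = special.gamma(5)

print(area, result.x, gamma_5)
```

integrate.quad also returns an estimate of the absolute error, which is worth inspecting when the integrand is less well behaved than a sine.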

11. NLTK

NLTK is a library for building Python programs that work with natural language. It provides easy-to-use interfaces to more than 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength natural language processing (NLP) libraries.
NLTK has been called "a wonderful tool for teaching, and working in, computational linguistics using Python".

import nltk
from nltk.corpus import treebank

# Resources must be downloaded before first use
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('treebank')

sentence = """At eight o'clock on Thursday morning Arthur didn't feel very good."""

# Tokenize
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)

# Identify named entities
entities = nltk.chunk.ne_chunk(tagged)

# Display a parse tree
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()

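
Stemming, one of the tasks listed in the description of NLTK, can be tried without downloading any corpus. The sketch below assumes that PorterStemmer and the rule-based TreebankWordTokenizer need no extra data files, which is my understanding of how they work; the sentence is made up for the example.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import TreebankWordTokenizer

sentence = "The runners were running quickly"

# Rule-based tokenizer: splits on whitespace and punctuation conventions.
tokens = TreebankWordTokenizer().tokenize(sentence)

# Reduce each token to its stem with the Porter algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

print(list(zip(tokens, stems)))
```

Stems are not always dictionary words; they are only meant to map related word forms to a common key.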

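12. spaCy

spaCy is a free, open source library for advanced NLP in Python. It can be used to build applications that process large volumes of text, to build information extraction or natural language understanding systems, or to preprocess text for deep learning.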

import spacy

texts = [
    "Net income was $9.4 million compared to the prior year of $2.7 million.",
    "Revenue exceeded twelve billion dollars, with a loss of $1b.",
]

nlp = spacy.load("en_core_web_sm")
for doc in nlp.pipe(texts, disable=["tok2vec", "tagger", "parser", "attribute_ruler", "lemmatizer"]):
    # Do something with the doc here
    print([(ent.text, ent.label_) for ent in doc.ents])

nlp.pipe yields Doc objects, so we can iterate over them and access the named entity predictions:

[('$9.4 million', 'MONEY'), ('the prior year', 'DATE'), ('$2.7 million', 'MONEY')]
[('twelve billion dollars', 'MONEY'), ('1b', 'MONEY')]

13. LibROSA

librosa is a Python library for music and audio analysis. It provides the building blocks needed to create music information retrieval systems.

# Beat tracking example
import librosa

# 1. Get the file path to an included audio example
filename = librosa.example('nutcracker')

# 2. Load the audio as a waveform `y`
#    Store the sampling rate as `sr`
y, sr = librosa.load(filename)

# 3. Run the default beat tracker
tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)
print('Estimated tempo: {:.2f} beats per minute'.format(tempo))

# 4. Convert the frame indices of beat events into timestamps
beat_times = librosa.frames_to_time(beat_frames, sr=sr)
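
Step 4 of the beat tracking example converts frame indices into seconds. Under librosa's default settings (a hop of 512 samples between frames, and audio resampled to 22050 Hz by librosa.load) that conversion is frame * hop_length / sr. The plain-Python sketch below shows only this arithmetic with made-up frame indices; the helper frames_to_seconds is hypothetical and does not call librosa.

```python
def frames_to_seconds(frames, sr=22050, hop_length=512):
    """Convert frame indices to timestamps in seconds."""
    return [frame * hop_length / sr for frame in frames]


# Hypothetical beat positions, expressed as frame indices.
example_frames = [0, 43, 86, 129]
example_times = frames_to_seconds(example_frames)

print([round(t, 3) for t in example_times])
```

With these defaults, roughly 43 frames correspond to one second of audio.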

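14. Pandas

Pandas is a fast, powerful, flexible, and easy-to-use open source tool for data analysis and manipulation. It can import data from many formats, such as CSV, JSON, SQL, and Microsoft Excel, and it supports operations such as merging, reshaping, and selection, as well as data cleaning and data wrangling. Pandas is widely used in academia, finance, statistics, and other data analysis fields.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

ts = pd.Series(np.random.randn(1000), index=pd.date_range("1/1/2000", periods=1000))
ts = ts.cumsum()
df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index, columns=list("ABCD"))
df = df.cumsum()
df.plot()
plt.show()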

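
The operations named in the description (cleaning, merging, selection, reshaping) each map to a single call. A small sketch on made-up data; the table contents and column names are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sales records with one missing value.
sales = pd.DataFrame({
    "city": ["Taipei", "Taipei", "Tainan", "Tainan"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [10.0, np.nan, 7.0, 9.0],
})
regions = pd.DataFrame({"city": ["Taipei", "Tainan"],
                        "region": ["North", "South"]})

cleaned = sales.dropna(subset=["revenue"])      # cleaning: drop incomplete rows
merged = cleaned.merge(regions, on="city")      # merging: join on a shared key
south = merged[merged["region"] == "South"]     # selection: boolean filter
wide = merged.pivot(index="city", columns="quarter",
                    values="revenue")           # reshaping: long to wide

print(wide)
print(south["revenue"].sum())
```

Note that the row dropped during cleaning shows up again as a missing cell after the pivot, because the wide table has a slot for every city and quarter combination.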

15. Matplotlib

Matplotlib is Python's plotting library. It provides a full set of commands modelled on MATLAB and can produce polished, publication-quality figures. Matplotlib makes plotting very simple and strikes a good balance between ease of use and performance.
Plotting several curves with Matplotlib:

# plot_multi_curve.py
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.1, 2 * np.pi, 100)
y_1 = x
y_2 = np.square(x)
y_3 = np.log(x)
y_4 = np.sin(x)
plt.plot(x, y_1)
plt.plot(x, y_2)
plt.plot(x, y_3)
plt.plot(x, y_4)
plt.show()

For more on plotting with Matplotlib, see the earlier post "Python-Matplotlib Visualization".

16. Seaborn

Seaborn is a Python data visualization library that wraps Matplotlib in a higher-level API, which makes plotting easier. Seaborn should be seen as a complement to Matplotlib, not a replacement.

import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme(style="ticks")
df = sns.load_dataset("penguins")
sns.pairplot(df, hue="species")
plt.show()


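17. Orange

Orange is an open source data mining and machine learning package that provides components for data exploration, visualization, preprocessing, and modelling. Its attractive, intuitive interactive user interface is well suited to beginners doing exploratory data analysis and visualization, while advanced users can also use it as a Python module for data manipulation and component development.
Conveniently, Orange can be installed with pip:

$ pip install orange3

Once installation finishes, run the orange-canvas command to launch the Orange graphical interface:

$ orange-canvas

After startup, the Orange graphical interface appears and is ready to use.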


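18. PyBrain

PyBrain is a modular machine learning library for Python. Its goal is to offer flexible, easy-to-use yet powerful algorithms for machine learning tasks, together with a variety of predefined environments for testing and comparing those algorithms. The name stands for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library.
We will use a simple example to show how PyBrain works: building a multilayer perceptron (MLP).
First, create a new feed-forward network object:

from pybrain.structure import FeedForwardNetwork

n = FeedForwardNetwork()

Next, construct the input, hidden, and output layers:

from pybrain.structure import LinearLayer, SigmoidLayer

inLayer = LinearLayer(2)
hiddenLayer = SigmoidLayer(3)
outLayer = LinearLayer(1)

To use these layers, they must be added to the network:

n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

Multiple input and output modules can be added. The network has to know which layers are inputs and which are outputs so that it can propagate activations forward and errors backward.
We still have to state explicitly how the layers are connected. For this we use the most common connection type, the full connection, implemented by the FullConnection class:

from pybrain.structure import FullConnection

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

As with the layers, the connections must be added to the network explicitly:

n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)

All the elements are now in place. Finally, we call the .sortModules() method to make the MLP usable:

n.sortModules()

This call performs some internal initialization that is required before the network can be used.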
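
To illustrate what the assembled network computes, here is a NumPy sketch of a forward pass through the same 2-3-1 layout: a linear input layer, a sigmoid hidden layer, and a linear output layer joined by two full connections. It does not use PyBrain. The weights are random placeholders, and the sketch assumes the connections carry weights only, since no bias unit was added to the network in the walkthrough.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder weights for the two full connections (hypothetical values).
w_in_to_hidden = rng.standard_normal((3, 2))   # 2 inputs -> 3 hidden units
w_hidden_to_out = rng.standard_normal((1, 3))  # 3 hidden units -> 1 output


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x):
    """Linear input -> sigmoid hidden -> linear output."""
    hidden = sigmoid(w_in_to_hidden @ x)
    return w_hidden_to_out @ hidden


output = forward(np.array([1.0, 2.0]))
print(output.shape)
```

Training would then adjust the two weight matrices so that the output matches the targets.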

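19. Milk

MILK (Machine Learning Toolkit) is a machine learning toolkit for Python. It focuses on supervised classification, with several classifiers available: SVMs, k-NN, random forests, and decision trees. It also performs feature selection, and these classifiers can be combined in many ways to form different classification systems. For unsupervised learning, MILK supports k-means clustering and affinity propagation.
Training a classifier with MILK:

import numpy as np
import milk

features = np.random.rand(100, 10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1
learner = milk.defaultclassifier()
model = learner.train(features, labels)

# Now you can use the model on new examples:
example = np.random.rand(10)
print(model.apply(example))
example2 = np.random.rand(10)
example2 += .5
print(model.apply(example2))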

20. TensorFlow

TensorFlow is an end-to-end open source machine learning platform with a comprehensive, flexible ecosystem. It is usually divided into TensorFlow 1.x and TensorFlow 2.x; the main difference is that TF 1.x uses static graphs while TF 2.x uses dynamic graphs through eager mode.
The example here uses TensorFlow 2.x to build a convolutional neural network (CNN).

import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# Data loading
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Data preprocessing: scale pixel values to [0, 1]
train_images, test_images = train_images / 255.0, test_images / 255.0

# Model construction
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

# Model compilation and training
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))

For more TensorFlow 2.x examples, see the TensorFlow column.

21. PyTorch

PyTorch grew out of Torch. It shares the same underlying backend as the Torch framework, but much of it was rewritten in Python. It is more flexible, supports dynamic computation graphs, and provides a Python interface.

# Import libraries
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt

# Model construction
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            # No activation after the last layer: nn.CrossEntropyLoss expects raw logits
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)

# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# Model training
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
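
The loss used in the training loop, nn.CrossEntropyLoss, takes the raw scores (logits) produced by the model together with integer class labels, applies log-softmax, and averages the negative log-probability of the correct class. The NumPy sketch below shows that formula on made-up logits. It is meant to explain the computation, not to replace the PyTorch implementation; the function name is hypothetical.

```python
import numpy as np


def cross_entropy_from_logits(logits, targets):
    """Mean negative log-likelihood of the target classes."""
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = log_probs[np.arange(len(targets)), targets]
    return -picked.mean()


# Two samples, three classes (hypothetical scores).
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
targets = np.array([0, 2])

loss = cross_entropy_from_logits(logits, targets)
print(loss)
```

The second sample has equal scores for all three classes, so it contributes log(3) to the average regardless of its label.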

22. Theano

Theano is a Python library, built on top of NumPy, for defining, optimizing, and efficiently evaluating mathematical expressions that involve multidimensional arrays.
Computing a Jacobian matrix in Theano:

import theano
import theano.tensor as T

x = T.dvector('x')
y = x ** 2
J, updates = theano.scan(lambda i, y, x: T.grad(y[i], x),
                         sequences=T.arange(y.shape[0]),
                         non_sequences=[y, x])
f = theano.function([x], J, updates=updates)
f([4, 4])
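
For y = x ** 2 the Jacobian is diagonal with entries 2 * x_i, so for the input [4, 4] the result should be [[8, 0], [0, 8]]. The sketch below checks this with central finite differences in NumPy. It is an independent numerical check, not Theano's symbolic differentiation, and the helper numerical_jacobian is written just for this example.

```python
import numpy as np


def numerical_jacobian(func, x, eps=1e-6):
    """Approximate d func_i / d x_j with central differences."""
    x = np.asarray(x, dtype=np.float64)
    n_out = func(x).shape[0]
    jac = np.zeros((n_out, x.shape[0]))
    for j in range(x.shape[0]):
        step = np.zeros_like(x)
        step[j] = eps
        jac[:, j] = (func(x + step) - func(x - step)) / (2 * eps)
    return jac


J = numerical_jacobian(lambda v: v ** 2, [4.0, 4.0])
print(np.round(J, 6))
```

Row i of the Jacobian holds the derivatives of y[i] with respect to every input, which is what the scan loop in the Theano code builds one row at a time.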

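23. Keras

Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano. Its development focuses on enabling fast experimentation, so that ideas can be turned into experimental results with as little delay as possible.

from keras.models import Sequential
from keras.layers import Dense

# Model construction
model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

# Model compilation and training
model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])
# x_train and y_train are not defined in this snippet; given the layers above they
# should have shapes (n_samples, 100) and (n_samples, 10), with one-hot labels.
model.fit(x_train, y_train, epochs=5, batch_size=32)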

24. Caffe

The official Caffe2 website states that Caffe2 is now part of PyTorch. The existing APIs will continue to work, but users are encouraged to use the PyTorch APIs.

25. MXNet

MXNet is a deep learning framework designed for both efficiency and flexibility. It lets you mix symbolic and imperative programming to maximize efficiency and productivity.
Building a handwritten digit recognition model with MXNet:

import mxnet as mx
from mxnet import gluon
from mxnet.gluon import nn
from mxnet import autograd as ag
import mxnet.ndarray as F

# Data loading
mnist = mx.test_utils.get_mnist()
batch_size = 100
train_data = mx.io.NDArrayIter(mnist['train_data'], mnist['train_label'], batch_size, shuffle=True)
val_data = mx.io.NDArrayIter(mnist['test_data'], mnist['test_label'], batch_size)

# CNN model
class Net(gluon.Block):
    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.conv1 = nn.Conv2D(20, kernel_size=(5,5))
        self.pool1 = nn.MaxPool2D(pool_size=(2,2), strides = (2,2))
        self.conv2 = nn.Conv2D(50, kernel_size=(5,5))
        self.pool2 = nn.MaxPool2D(pool_size=(2,2), strides = (2,2))
        self.fc1 = nn.Dense(500)
        self.fc2 = nn.Dense(10)

    def forward(self, x):
        x = self.pool1(F.tanh(self.conv1(x)))
        x = self.pool2(F.tanh(self.conv2(x)))
        # 0 means copy over size from corresponding dimension.
        # -1 means infer size from the rest of dimensions.
        x = x.reshape((0, -1))
        x = F.tanh(self.fc1(x))
        x = F.tanh(self.fc2(x))
        return x
net = Net()

# Initialization and optimizer
# Set the context to GPU if available, otherwise CPU
ctx = [mx.gpu() if mx.test_utils.list_gpus() else mx.cpu()]
net.initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.03})

# Model training
epoch = 10  # number of training epochs (assumed; not defined in the original snippet)
# Use Accuracy as the evaluation metric.
metric = mx.metric.Accuracy()
softmax_cross_entropy_loss = gluon.loss.SoftmaxCrossEntropyLoss()
for i in range(epoch):
    # Reset the train data iterator.
    train_data.reset()
    for batch in train_data:
        data = gluon.utils.split_and_load(batch.data[0], ctx_list=ctx, batch_axis=0)
        label = gluon.utils.split_and_load(batch.label[0], ctx_list=ctx, batch_axis=0)
        outputs = []
        # Inside training scope
        with ag.record():
            for x, y in zip(data, label):
                z = net(x)
                # Computes softmax cross entropy loss.
                loss = softmax_cross_entropy_loss(z, y)
                # Backpropogate the error for one iteration.
                loss.backward()
                outputs.append(z)
        metric.update(label, outputs)
        trainer.step(batch.data[0].shape[0])
    # Gets the evaluation result.
    name, acc = metric.get()
    # Reset evaluation result to initial state.
    metric.reset()
    print('training acc at epoch %d: %s=%f'%(i, name, acc))

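26. PaddlePaddle

PaddlePaddle builds on Baidu's years of deep learning research and business applications. It brings together a core training and inference framework, a library of base models, end-to-end development kits, and a rich set of tool components. It is billed as China's first independently developed, fully featured, open source industrial-grade deep learning platform.
Implementing LeNet-5 with PaddlePaddle:

# Import the required packages
import paddle
import numpy as np
from paddle.nn import Conv2D, MaxPool2D, Linear

## Build the network
import paddle.nn.functional as F

# Define the LeNet network structure
class LeNet(paddle.nn.Layer):
    def __init__(self, num_classes=1):
        super(LeNet, self).__init__()
        # Create the convolution and pooling layers
        # First convolution layer
        self.conv1 = Conv2D(in_channels=1, out_channels=6, kernel_size=5)
        self.max_pool1 = MaxPool2D(kernel_size=2, stride=2)
        # Shape logic: pooling does not change the channel count, which is currently 6
        # Second convolution layer
        self.conv2 = Conv2D(in_channels=6, out_channels=16, kernel_size=5)
        self.max_pool2 = MaxPool2D(kernel_size=2, stride=2)
        # Third convolution layer
        self.conv3 = Conv2D(in_channels=16, out_channels=120, kernel_size=4)
        # Shape logic: the data is flattened before the fully connected layers, [B,C,H,W] -> [B,C*H*W]
        # With a [28,28] input, C*H*W equals 120 after three convolutions and two poolings
        # Fully connected layers: the first outputs 64 neurons, the second one neuron per class label
        self.fc1 = Linear(in_features=120, out_features=64)
        self.fc2 = Linear(in_features=64, out_features=num_classes)
    # Forward pass of the network
    def forward(self, x):
        # Each of the first two convolution layers uses a sigmoid activation followed by 2x2 pooling
        x = self.conv1(x)
        x = F.sigmoid(x)
        x = self.max_pool1(x)
        x = self.conv2(x)
        x = F.sigmoid(x)
        x = self.max_pool2(x)
        x = self.conv3(x)
        # Shape logic: flatten the data, [B,C,H,W] -> [B,C*H*W]
        x = paddle.reshape(x, [x.shape[0], -1])
        x = self.fc1(x)
        x = F.sigmoid(x)
        x = self.fc2(x)
        return x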
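
The code comments state that a 28 x 28 input flattens to 120 features after three convolutions and two poolings. The short sketch below works through that arithmetic with the standard output-size formula for convolution and pooling without padding; the helper conv_out is written just for this check and is not part of PaddlePaddle.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                   # input is 28 x 28
size = conv_out(size, kernel=5)             # conv1     -> 24
size = conv_out(size, kernel=2, stride=2)   # max_pool1 -> 12
size = conv_out(size, kernel=5)             # conv2     -> 8
size = conv_out(size, kernel=2, stride=2)   # max_pool2 -> 4
size = conv_out(size, kernel=4)             # conv3     -> 1

channels = 120                              # out_channels of conv3
flattened = channels * size * size
print(size, flattened)
```

Because the spatial size shrinks to 1 x 1, the flattened length equals the number of output channels of the last convolution, which is why fc1 takes 120 input features.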

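27. CNTK

CNTK (Cognitive Toolkit) is a deep learning toolkit that describes a neural network as a series of computational steps in a directed graph. In this graph, leaf nodes represent input values or network parameters, and the other nodes represent matrix operations on their inputs. CNTK makes it easy to implement and combine popular model types such as CNNs.
CNTK describes a neural network in its network description language (NDL). In short, you describe the input features, the input labels, the parameters, the computations that relate parameters and inputs, and which nodes are the targets.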

NDLNetworkBuilder=[
    
    run=ndlLR
    
    ndlLR=[
      # sample and label dimensions
      SDim=$dimension$
      LDim=1
    
      features=Input(SDim, 1)
      labels=Input(LDim, 1)
    
      # parameters to learn
      B0 = Parameter(4) 
      W0 = Parameter(4, SDim)
      
      
      B = Parameter(LDim)
      W = Parameter(LDim, 4)
    
      # operations
      t0 = Times(W0, features)
      z0 = Plus(t0, B0)
      s0 = Sigmoid(z0)   
      
      t = Times(W, s0)
      z = Plus(t, B)
      s = Sigmoid(z)    
    
      LR = Logistic(labels, s)
      EP = SquareError(labels, s)
    
      # root nodes
      FeatureNodes=(features)
      LabelNodes=(labels)
      CriteriaNodes=(LR)
      EvalNodes=(EP)
      OutputNodes=(s,t,z,s0,W0)
    ]   
  ]
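
Read top to bottom, the NDL block describes a small network with one sigmoid hidden layer of four units and a sigmoid output. The NumPy sketch below traces the same operations in the same order for a single sample. It is an illustration under stated assumptions, not CNTK code: SDim is set to 3 as a placeholder because the config leaves it as $dimension$, the parameters are random, and the two formulas used for the Logistic and SquareError nodes (binary logistic loss and summed squared difference) are my assumption about what those nodes compute.

```python
import numpy as np

rng = np.random.default_rng(0)
s_dim, l_dim, hidden = 3, 1, 4   # s_dim stands in for $dimension$

# Parameters to learn (random placeholders here).
W0 = rng.standard_normal((hidden, s_dim))
B0 = rng.standard_normal((hidden, 1))
W = rng.standard_normal((l_dim, hidden))
B = rng.standard_normal((l_dim, 1))


def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))


features = rng.standard_normal((s_dim, 1))   # one sample as a column vector
labels = np.array([[1.0]])

# Operations, in the same order as the NDL block.
t0 = W0 @ features
z0 = t0 + B0
s0 = sigmoid(z0)
t = W @ s0
z = t + B
s = sigmoid(z)

# Criterion and evaluation values (formulas assumed, see the note above).
logistic_loss = -(labels * np.log(s) + (1 - labels) * np.log(1 - s)).sum()
square_error = ((labels - s) ** 2).sum()
print(s.shape, logistic_loss, square_error)
```

The root node lists at the end of the NDL block then tell CNTK which of these values is the training criterion (LR) and which is reported for evaluation (EP).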

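Summary and Classification

Summary of common Python machine learning and deep learning libraries

Library	Official website	Description
NumPy	http://www.numpy.org/	Support for large multidimensional arrays. NumPy is a key library in computer vision because images can be represented as multidimensional arrays, which has many advantages
OpenCV	https://opencv.org/	Open source computer vision library
Scikit-image	https://scikit-image.org/	Collection of image processing algorithms; images handled by scikit-image must be NumPy arrays
Python Imaging Library (PIL)	http://www.pythonware.com/products/pil/	Image processing library with powerful image processing and graphics capabilities
Pillow	https://pillow.readthedocs.io/	A fork of PIL
SimpleCV	http://simplecv.org/	Computer vision framework that provides key image processing functionality
Mahotas	https://mahotas.readthedocs.io/	A set of functions for image processing and computer vision. It was originally designed for bioimage informatics but is now useful in other fields as well, and it is based entirely on NumPy arrays as its data type
Ilastik	http://ilastik.org/	User-friendly, simple interactive tool for image segmentation, classification, and analysis
Scikit-learn	http://scikit-learn.org/	Machine learning library with a range of classification, regression, and clustering algorithms
SciPy	https://www.scipy.org/	Library for scientific and technical computing
NLTK	https://www.nltk.org/	Libraries and programs for processing natural language data
spaCy	https://spacy.io/	Open source library for advanced natural language processing in Python
LibROSA	https://librosa.github.io/librosa/	Library for music and audio processing
Pandas	https://pandas.pydata.org/	Library built on top of NumPy that provides high-level data analysis tools and easy-to-use data structures
Matplotlib	https://matplotlib.org	Plotting library with a full set of commands modelled on MATLAB; produces publication-quality figures
Seaborn	https://seaborn.pydata.org/	Plotting library built on top of Matplotlib
Orange	https://orange.biolab.si/	Open source machine learning and data visualization toolkit for beginners and experts
PyBrain	http://pybrain.org/	Machine learning library that provides easy-to-use, state-of-the-art algorithms
Milk	http://luispedro.org/software/milk/	Machine learning toolkit, focused mainly on multi-class problems in supervised learning
TensorFlow	https://www.tensorflow.org/	Open source machine learning and deep learning library
PyTorch	https://pytorch.org/	Open source machine learning and deep learning library
Theano	http://deeplearning.net/software/theano/	Library for fast definition, evaluation, and computation of mathematical expressions; compiles to run on CPU and GPU architectures
Keras	https://keras.io/	High-level deep learning library that can run on top of TensorFlow, Theano, or CNTK (Microsoft Cognitive Toolkit)
Caffe2	https://caffe2.ai/	Deep learning framework that combines expressiveness, speed, and modularity; an experimental refactoring of Caffe that organizes computation more flexibly
MXNet	https://mxnet.apache.org/	Deep learning framework designed for efficiency and flexibility; allows mixing symbolic and imperative programming
PaddlePaddle	https://www.paddlepaddle.org.cn	Built on Baidu's years of deep learning research and business applications; combines a core training and inference framework, a base model library, end-to-end development kits, and a rich set of tool components
CNTK	https://cntk.ai/	Deep learning toolkit that describes a neural network as a series of computational steps in a directed graph, where leaf nodes are input values or network parameters and the other nodes are matrix operations on their inputs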

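Classification

These libraries can be grouped by their main use:

Category	Libraries
Image processing	NumPy, OpenCV, scikit-image, PIL, Pillow, SimpleCV, Mahotas, ilastik
Text processing	NLTK, spaCy, NumPy, scikit-learn, PyTorch
Audio processing	LibROSA
Machine learning	pandas, scikit-learn, Orange, PyBrain, Milk
Data visualization	Matplotlib, Seaborn, scikit-learn, Orange
Deep learning	TensorFlow, PyTorch, Theano, Keras, Caffe2, MXNet, PaddlePaddle, CNTK
Scientific computing	SciPy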