As with many other things that require some level of knowledge - it's very easy if you know what you are doing.
While "making" a Neural Network comes in different flavors and levels, they are all quite straightforward, given that you have the necessary foundations. In fact, it's not much different from "making" a cake:
- Level 0: You can buy one from the bakery and just eat it - similarly, there are deployed Neural Networks out there that you can play with to get some intuition for what they can do and how they work. Check out this one, for example: Tensorflow — Neural Network Playground
- Level 1: You can buy a ready-made cake in the supermarket and just put it in the oven for a few minutes - similarly, you can load a pretrained model and start running it. It's as simple as:
```python
from keras.applications.resnet50 import ResNet50

model = ResNet50(weights='imagenet')
preds = model.predict(someImage)  # someImage: a preprocessed batch of input images
```
See Applications - Keras Documentation to learn more about loading pretrained models.
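To actually classify an image, you also need to load and preprocess it the way the network expects, and then decode the predictions. Here's a short sketch following the ResNet50 example from the Keras documentation (the filename 'elephant.jpg' is just a placeholder):

```python
import numpy as np
from keras.preprocessing import image
from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions

model = ResNet50(weights='imagenet')

# 'elephant.jpg' is a placeholder - use any image you have on disk
img = image.load_img('elephant.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # the model expects a batch dimension
x = preprocess_input(x)

preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])  # top-3 (class, description, probability) tuples
```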
- Level 2: You can take a bunch of ready-made chocolate cakes, cut and paste them, and make a new cake out of them (trains are a hit!) - similarly, you can take that pretrained model you just loaded, chop it up, maybe tweak it a bit, and reuse it for transfer learning. It can be as simple as chopping off the last layer (the classifying softmax) or taking the output of a specific layer - both can be done very easily:
```python
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.models import Model

# load the model excluding the last layer (the classifier)
model = VGG16(weights='imagenet', include_top=False)

# load a specific layer's output
base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)
```
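That second model now outputs intermediate features instead of class probabilities. A quick usage sketch, following the feature-extraction pattern from the Keras documentation (the filename is again a placeholder):

```python
import numpy as np
from keras.preprocessing import image
from keras.models import Model
from keras.applications.vgg19 import VGG19, preprocess_input

base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

img = image.load_img('elephant.jpg', target_size=(224, 224))  # placeholder filename
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# shape (1, 14, 14, 512): the activations of VGG19's block4_pool layer
block4_pool_features = model.predict(x)
```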
Even a full-blown transfer learning solution is fairly easy to understand and write:
```python
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from Inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit_generator(...)
```
*Transferred chocolate cake*
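The `fit_generator(...)` calls above expect a data generator. Here is a minimal sketch of what one could look like, plugging into the model defined above and assuming a hypothetical `data/train` folder with one sub-directory per class:

```python
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.inception_v3 import preprocess_input

# 'data/train' is a hypothetical path: one sub-directory per class
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(299, 299),  # InceptionV3's default input size
    batch_size=32,
    class_mode='categorical')

model.fit_generator(train_generator, steps_per_epoch=100, epochs=3)
```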
- Level 3: You can bake the cake yourself, following a recipe. It's up to you whether to use ready-made (instant) ingredients or to prepare everything - the dough, the cream, the topping - from the most basic ingredients. You can even grow some of them… - similarly, you can implement a pretty complex image-classifying CNN in just a few simple lines of tf.keras, or you can go all in and implement everything from scratch using basic mathematical functions in vanilla numpy (or C…) - see the short numpy sketch after the Keras example below. If you take the Keras path, even writing and running a VGG-like CNN takes only a few (simple and clear) lines of code:
```python
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)

model = Sequential()
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)
```
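And for the other extreme, here is what the "from scratch in vanilla numpy" path can look like - a minimal sketch with the forward pass, backpropagation, and gradient descent written out by hand. The network size, learning rate, and toy XOR data are all illustrative assumptions:

```python
import numpy as np

np.random.seed(0)

# toy data: the XOR problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# randomly initialized weights for a 2 -> 8 -> 1 network
W1 = np.random.randn(2, 8)
b1 = np.zeros((1, 8))
W2 = np.random.randn(8, 1)
b2 = np.zeros((1, 1))

lr = 1.0  # learning rate (illustrative)
for step in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)    # hidden activations
    out = sigmoid(h @ W2 + b2)  # network output

    # backward pass: gradients of the squared error through the sigmoids
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # gradient descent update
    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(2))  # should end up close to [[0], [1], [1], [0]]
```

The Keras version above does exactly this kind of bookkeeping for you, just at a much larger scale.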
- Finally, you can decide to open your own bakery and sell your cakes, or write a cookbook - similarly, the Jedi level of ML is when you use it in production or do cutting-edge research. This is when you need to go beyond "just" building the network, and also solve for getting good data, deciding on the best architecture, running it in production, updating the model, etc…