Tip of the Day

Do not go where the path may lead, go instead where there is no path and leave a trail.

How easy is it to make a neural network?

As with many other things that require some level of knowledge, it's very easy if you know what you are doing.
While "making" a Neural Network comes in different flavors and levels, they are all quite straightforward, provided you have the necessary foundations. Actually, it's not much different from "making" a cake:
  • Level 0: You can buy one from the bakery and just eat it - similarly, there are deployed Neural Networks out there that you can play with in order to get some intuition on what they can do and how they work. Check out this one, for example: TensorFlow — Neural Network Playground
  • Level 1: You can buy a ready-made cake in the supermarket and just put it in the oven for a few minutes - similarly, you can load a pretrained model and start running it. It's as simple as:
from keras.applications.resnet50 import ResNet50

# load a ResNet50 model pretrained on ImageNet and run it on an image
model = ResNet50(weights='imagenet')
preds = model.predict(someImage)
See Applications - Keras Documentation to learn more about loading pretrained models.
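To actually run the snippet above end to end, the input image has to be loaded and preprocessed the way ResNet50 expects. A minimal sketch of that (the file name elephant.jpg is just an illustrative placeholder) could look like this:

import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from keras.preprocessing import image

# load the pretrained model (downloads the ImageNet weights on first use)
model = ResNet50(weights='imagenet')

# load an image, resize it to the 224x224 input ResNet50 expects,
# and turn it into a preprocessed batch of one
img = image.load_img('elephant.jpg', target_size=(224, 224))  # placeholder file name
x = np.expand_dims(image.img_to_array(img), axis=0)
x = preprocess_input(x)

# predict and print the top 3 ImageNet classes
preds = model.predict(x)
print(decode_predictions(preds, top=3)[0])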
  • Level 2: You can take a bunch of ready-made chocolate cakes, cut and paste them, and make a new cake out of it (trains are a hit!) - similarly, you can take that pretrained model you just loaded, chop it, maybe tweak it a bit, and reuse it for transfer learning. It can be as simple as chopping off the last layer (the classifying softmax), or loading the output of a specific layer - both can be done very easily:
from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19
from keras.models import Model

# load the model excluding the last (classification) layers
model = VGG16(weights='imagenet', include_top=False)

# load a specific layer's output
base_model = VGG19(weights='imagenet')
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)
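That truncated model can already be used as a feature extractor for downstream tasks. A minimal usage sketch (cat.jpg is just an illustrative placeholder) might look like:

import numpy as np
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing import image

# extract 'block4_pool' features for a single image with the truncated model above
img = image.load_img('cat.jpg', target_size=(224, 224))  # placeholder file name
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = model.predict(x)  # (1, 14, 14, 512) feature map for a 224x224 input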
Even a full-blown transfer learning solution is fairly easy to understand and write:
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)

# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)

# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized),
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from Inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect;
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit_generator(...)
Transferred chocolate cake
  • Level 3: You can bake the cake yourself, following a recipe. It's up to you whether to use ready-made (instant) ingredients, or prepare it all - the dough, the cream, the topping, etc. - by yourself from the most basic ingredients. You can even grow some of them… - similarly, you can implement a pretty complex image-classifying CNN in just a few simple lines in tf.keras, or you can go all in and implement everything from scratch using basic mathematical functions in vanilla numpy (or C…). If you take the Keras path, even writing and running a VGG-like CNN takes only a few (simple and clear) lines of code (a tiny from-scratch numpy sketch follows it further below):
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)

model = Sequential()

# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)
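And if you take the from-scratch path instead, even a toy fully-connected network fits in a page of plain numpy. Here is a minimal sketch, assuming dummy random data, one hidden layer, and plain gradient descent - every name and size below is illustrative:

import numpy as np

# dummy data: 100 samples, 20 features, binary labels
rng = np.random.RandomState(0)
X = rng.randn(100, 20)
y = (rng.rand(100, 1) > 0.5).astype(float)

# one hidden layer with 32 units, sigmoid activations throughout
W1 = rng.randn(20, 32) * 0.1
b1 = np.zeros((1, 32))
W2 = rng.randn(32, 1) * 0.1
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for epoch in range(1000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # binary cross-entropy loss
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    # backward pass (manual backprop)
    dz2 = (p - y) / len(X)            # gradient of loss w.r.t. output pre-activation
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dh = dz2 @ W2.T
    dz1 = dh * h * (1 - h)            # sigmoid derivative
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)

    # gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print('final loss:', loss)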
  • Finally, you can decide to open your own bakery and sell your cakes, or write a cookbook - similarly, the Jedi level of ML is when you use it in production or do cutting-edge research. This is when you need to go beyond "just" building the network and also solve for getting good data, deciding on the best architecture, running it in production, updating the model, etc…
pc: Thanks to Yariv Adan (works at Google Assistant) for this good article.

Himanshu Rai
