neural-network – IT Nursery

Where do I call the BatchNormalization function in Keras?

May 31, 2022 by IT Nursery

If I want to use the BatchNormalization function in Keras, do I need to call it only once, at the beginning? I read the documentation for it at http://keras.io/layers/normalization/ but I don’t see where I’m supposed to call it. Below is my code attempting to use it: model = Sequential() keras.layers.normalization.BatchNormalization(epsilon=1e-06, mode=0, momentum=0.9, weights=None) model.add(Dense(64, input_dim=14, … Read more

Why do binary_crossentropy and categorical_crossentropy give different performance for the same problem?

May 30, 2022 by IT Nursery

I’m trying to train a CNN to categorize text by topic. When I use binary cross-entropy I get ~80% accuracy; with categorical cross-entropy I get ~50% accuracy. I don’t understand why this is. It’s a multiclass problem. Doesn’t that mean that I have to use categorical cross-entropy and that the results with binary cross-entropy are … Read more
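For orientation, the two losses named in the question can be sketched in plain Python for a single one-hot example. This is only an illustration of the textbook formulas, not Keras's implementation; the function names, the `eps` clipping value, and the sample vectors are assumptions made for this sketch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Textbook form: -sum(t * log(p)) over the classes of one example."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Textbook form: each class is scored as its own yes/no problem, then averaged."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical one-hot target and predicted probabilities for three classes.
y_true = [0, 1, 0]
y_pred = [0.2, 0.7, 0.1]
print(categorical_crossentropy(y_true, y_pred))
print(binary_crossentropy(y_true, y_pred))
```

The categorical form looks only at the probability assigned to the true class, while the binary form also scores every non-target class, so the two numbers are not directly comparable.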

Ordering of batch normalization and dropout?

May 30, 2022 by IT Nursery

The original question was in regard to TensorFlow implementations specifically. However, the answers are for implementations in general. This general answer is also the correct answer for TensorFlow. When using batch normalization and dropout in TensorFlow (specifically using the contrib.layers), do I need to be worried about the ordering? It seems possible that if I … Read more

Why use softmax as opposed to standard normalization?

May 30, 2022 by IT Nursery

In the output layer of a neural network, it is typical to use the softmax function to approximate a probability distribution. This is expensive to compute because of the exponents. Why not simply perform a Z transform so that all outputs are positive, and then normalise just by dividing all outputs by the sum of … Read more
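The contrast the question draws can be sketched in plain Python. The function names are hypothetical, and `shift_and_normalise` is only one assumed reading of "make all outputs positive, then divide by the sum" (here: subtract the minimum), since the excerpt is truncated before it spells out the details.

```python
import math

def softmax(scores):
    """Exponentiate and normalise so the outputs are positive and sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shift_and_normalise(scores):
    """Assumed alternative: shift so the minimum is 0, then divide by the sum."""
    low = min(scores)
    shifted = [s - low for s in scores]
    total = sum(shifted)
    if total == 0:
        # All scores are equal: fall back to a uniform distribution.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in shifted]

# Hypothetical raw output-layer scores.
scores = [2.0, 1.0, 0.1]
print(softmax(scores))
print(shift_and_normalise(scores))
```

Under this reading, the shifted version always assigns probability 0 to the smallest score, while softmax keeps every output strictly positive in exact arithmetic.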

How to initialize weights in PyTorch?

May 29, 2022 by IT Nursery

How do I initialize weights and biases of a network (via e.g. He or Xavier initialization)?

Why do we need to call zero_grad() in PyTorch?

May 22, 2022 by IT Nursery

Why does zero_grad() need to be called during training? The documented description is: zero_grad(self): Sets gradients of all model parameters to zero.

How to interpret loss and accuracy for a machine learning model [closed]

May 21, 2022 by IT Nursery

When I train my neural network with Theano or TensorFlow, they report a variable called “loss” per epoch. How … Read more

Extremely small or NaN values appear in training neural network

May 14, 2022 by IT Nursery

I’m trying to implement a neural network architecture in Haskell, and use it on MNIST. I’m using the hmatrix package for linear algebra. My training framework is built using the pipes package. My code compiles and doesn’t crash. But the problem is, certain combinations of layer size (say, 1000), minibatch size, and learning rate give … Read more

Keras input explanation: input_shape, units, batch_size, dim, etc

May 10, 2022 by IT Nursery

For any Keras layer (Layer class), can someone explain how to understand the difference between input_shape, units, dim, etc.? For example, the docs say units specifies the output shape of a layer. In the image of the neural net below, hidden layer1 has 4 units. Does this directly translate to the units attribute of the … Read more

What is the meaning of the word logits in TensorFlow? [duplicate]

May 10, 2022 by IT Nursery

In the following TensorFlow function, we must feed the activation of artificial neurons in the final layer. That I understand. But I don’t understand why it is called logits. Isn’t that a … Read more