IMU Corrector Tutorial
Uncomment the following lines if you are running this script on Google Colab:
# !pip install pypose
# !pip install pykitti
In this tutorial, we will implement a simple IMUCorrector using torch.nn modules and pypose.IMUPreintegrator.
The IMUCorrector takes noisy IMU sensor readings as input and outputs the corrected IMU integration result.
In a sense, IMUCorrector is an improved IMUPreintegrator.
We will show that pypose.module.IMUPreintegrator can be combined smoothly into network training.
Skip the first two parts if you have already seen them in the IMU integrator tutorial.
import torch
import pykitti
import numpy as np
import pypose as pp
from torch import nn
import tqdm, argparse
from datetime import datetime
import torch.utils.data as Data
from torch.optim.lr_scheduler import ReduceLROnPlateau
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from matplotlib.collections import PatchCollection
1. Dataset Definition
First we define the KITTI_IMU dataset as a data.Dataset in torch, for easy usage.
We’re using the pykitti package.
This package provides a minimal set of tools for working with the KITTI datasets.
To access a data sequence, use:
dataset = pykitti.raw(root, dataname, drive)
Some of the data attributes we used below are:
dataset.timestamps: Timestamps are parsed into a list of datetime objects
dataset.oxts: List of OXTS packets and 6-dof poses as named tuples
For more details about the data format, please refer to their github page here.
A sequence is separated into many segments. The number of segments is controlled by step_size.
Each segment of the sequence returns measurements such as dt, acc, and gyro for a few frames; the number of frames is defined by duration.
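As a rough illustration of how step_size and duration interact, the sketch below computes segment start indices for a toy sequence. The helper segment_starts and the frame count of 30 are assumptions made for this example only; it mirrors the windowing idea and is not part of the dataset class.

```python
def segment_starts(num_frames, duration=10, step_size=1):
    """Start index of every segment of `duration` frames (hypothetical helper)."""
    return list(range(0, num_frames - duration, step_size))


# A toy sequence of 30 frames, split into 10-frame segments.
dense = segment_starts(30, duration=10, step_size=1)
sparse = segment_starts(30, duration=10, step_size=5)

# A larger step_size yields fewer segments that overlap less.
print(len(dense), len(sparse))
# Each segment covers the frames [start, start + duration).
print([(s, s + 10) for s in sparse])
```

With step_size=1 consecutive segments overlap by all but one frame, which gives many training samples from a short sequence at the cost of strong correlation between them.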
class KITTI_IMU(Data.Dataset):
    def __init__(self, root, dataname, drive, duration=10, step_size=1, mode='train'):
        super().__init__()
        self.duration = duration
        self.data = pykitti.raw(root, dataname, drive)
        self.seq_len = len(self.data.timestamps) - 1
        assert mode in ['evaluate', 'train',
                        'test'], "{} mode is not supported.".format(mode)
        # Time interval between consecutive frames, in seconds
        self.dt = torch.tensor([datetime.timestamp(self.data.timestamps[i+1]) -
                                datetime.timestamp(self.data.timestamps[i])
                                for i in range(self.seq_len)])
        self.gyro = torch.tensor([[self.data.oxts[i].packet.wx,
                                   self.data.oxts[i].packet.wy,
                                   self.data.oxts[i].packet.wz]
                                  for i in range(self.seq_len)])
        self.acc = torch.tensor([[self.data.oxts[i].packet.ax,
                                  self.data.oxts[i].packet.ay,
                                  self.data.oxts[i].packet.az]
                                 for i in range(self.seq_len)])
        # Ground-truth rotation, velocity (rotated by the ground-truth orientation), and position
        self.gt_rot = pp.euler2SO3(torch.tensor([[self.data.oxts[i].packet.roll,
                                                  self.data.oxts[i].packet.pitch,
                                                  self.data.oxts[i].packet.yaw]
                                                 for i in range(self.seq_len)]))
        self.gt_vel = self.gt_rot @ torch.tensor([[self.data.oxts[i].packet.vf,
                                                   self.data.oxts[i].packet.vl,
                                                   self.data.oxts[i].packet.vu]
                                                  for i in range(self.seq_len)])
        self.gt_pos = torch.tensor(
            np.array([self.data.oxts[i].T_w_imu[0:3, 3] for i in range(self.seq_len)]))
        # 'train' uses the first half of the sequence, 'test' the second half,
        # and 'evaluate' the whole sequence
        start_frame = 0
        end_frame = self.seq_len
        if mode == 'train':
            end_frame = np.floor(self.seq_len * 0.5).astype(int)
        elif mode == 'test':
            start_frame = np.floor(self.seq_len * 0.5).astype(int)
        # Segment start indices; starting at start_frame keeps the 'test' split
        # on the second half instead of reusing the first frames of the sequence
        self.index_map = [i for i in range(
            start_frame, end_frame - self.duration, step_size)]

    def __len__(self):
        return len(self.index_map)

    def __getitem__(self, i):
        frame_id = self.index_map[i]
        end_frame_id = frame_id + self.duration
        return {
            'dt': self.dt[frame_id: end_frame_id],
            'acc': self.acc[frame_id: end_frame_id],
            'gyro': self.gyro[frame_id: end_frame_id],
            'gt_pos': self.gt_pos[frame_id+1: end_frame_id+1],
            'gt_rot': self.gt_rot[frame_id+1: end_frame_id+1],
            'gt_vel': self.gt_vel[frame_id+1: end_frame_id+1],
            'init_pos': self.gt_pos[frame_id][None, ...],
            # TODO: the init rotation might be used in gravity compensation
            'init_rot': self.gt_rot[frame_id: end_frame_id],
            'init_vel': self.gt_vel[frame_id][None, ...],
        }

    def get_init_value(self):
        return {'pos': self.gt_pos[:1],
                'rot': self.gt_rot[:1],
                'vel': self.gt_vel[:1]}
2. Utility Functions
These are several utility functions. You can skip to the parameter definitions and come back when necessary.
imu_collate
imu_collate is used in batch operations to stack the data of multiple samples into one batch.
def imu_collate(data):
    acc = torch.stack([d['acc'] for d in data])
    gyro = torch.stack([d['gyro'] for d in data])
    gt_pos = torch.stack([d['gt_pos'] for d in data])
    gt_rot = torch.stack([d['gt_rot'] for d in data])
    gt_vel = torch.stack([d['gt_vel'] for d in data])
    init_pos = torch.stack([d['init_pos'] for d in data])
    init_rot = torch.stack([d['init_rot'] for d in data])
    init_vel = torch.stack([d['init_vel'] for d in data])
    dt = torch.stack([d['dt'] for d in data]).unsqueeze(-1)
    return {
        'dt': dt,
        'acc': acc,
        'gyro': gyro,
        'gt_pos': gt_pos,
        'gt_vel': gt_vel,
        'gt_rot': gt_rot,
        'init_pos': init_pos,
        'init_vel': init_vel,
        'init_rot': init_rot,
    }
move_to
move_to recursively moves tensors, and dicts or lists of tensors, to the given device (for example, a CUDA device).
def move_to(obj, device):
    if torch.is_tensor(obj):
        return obj.to(device)
    elif isinstance(obj, dict):
        res = {}
        for k, v in obj.items():
            res[k] = move_to(v, device)
        return res
    elif isinstance(obj, list):
        res = []
        for v in obj:
            res.append(move_to(v, device))
        return res
    else:
        raise TypeError("Invalid type for move_to", obj)
plot_gaussian
plot_gaussian is used to plot ellipses that represent uncertainty; a bigger ellipse means bigger uncertainty.
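To make the link between a covariance and its ellipse concrete, here is a minimal sketch that derives the ellipse semi-axes and orientation from a 2x2 covariance matrix in closed form. The helper ellipse_from_cov and the sample covariances are assumptions for illustration; it shows the underlying geometry and is not a drop-in replacement for plot_gaussian.

```python
import math


def ellipse_from_cov(cov, sigma=3):
    """Semi-axes and orientation (degrees) of the sigma-ellipse of a symmetric 2x2 covariance."""
    (a, b), (_, d) = cov
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, d]]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    lam_major = mean + radius
    lam_minor = max(mean - radius, 0.0)  # guard against tiny negative round-off
    # Direction of the major axis, measured counter-clockwise from the x-axis
    angle = 0.5 * math.degrees(math.atan2(2.0 * b, a - d))
    return sigma * math.sqrt(lam_major), sigma * math.sqrt(lam_minor), angle


small = ellipse_from_cov([[0.01, 0.0], [0.0, 0.01]])
large = ellipse_from_cov([[0.04, 0.0], [0.0, 0.01]])
tilted = ellipse_from_cov([[2.0, 1.0], [1.0, 2.0]])

# Larger variance along x stretches the major axis; correlation rotates the ellipse.
print(small, large, tilted)
```

The square root of each eigenvalue is the standard deviation along that principal direction, so scaling it by sigma gives the extent of the sigma-level contour.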
def plot_gaussian(ax, means, covs, color=None, sigma=3):
    ''' Set specific color to show edges, otherwise same with facecolor.'''
    ellipses = []
    for i in range(len(means)):
        eigvals, eigvecs = np.linalg.eig(covs[i])
        axis = np.sqrt(eigvals) * sigma
        slope = eigvecs[1][0] / eigvecs[1][1]
        angle = 180.0 * np.arctan(slope) / np.pi
        ellipses.append(Ellipse(means[i, 0:2], axis[0], axis[1], angle=angle))
    ax.add_collection(PatchCollection(ellipses, edgecolors=color, linewidth=1))
3. Define IMU Corrector
Here we define the IMUCorrector module. It has two parts, the net and the imu:
- net is a network that resembles an autoencoder. It consists of a sequence of linear layers and activation layers, and returns a correction to the IMU measurements. Adding this correction to the original IMU sensor data gives the corrected sensor reading.
- imu is a pypose.module.IMUPreintegrator. Using the corrected sensor reading from the previous step as the input to the IMUPreintegrator, we can get a more accurate IMU integration result.
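The correction step can be written compactly as a residual update. The notation below is an assumption introduced for this note: $a_k$ and $\omega_k$ are the raw accelerometer and gyroscope readings at frame $k$, $f_\theta$ is the net, and hats mark the corrected readings that are passed to the IMUPreintegrator.

```latex
\begin{aligned}
[\delta a_k,\ \delta\omega_k] &= f_\theta\big([a_k,\ \omega_k]\big), \\
\hat{a}_k &= a_k + \delta a_k, \\
\hat{\omega}_k &= \omega_k + \delta\omega_k .
\end{aligned}
```

Because the network only predicts an additive correction, an output of zero leaves the raw readings unchanged and recovers the plain IMUPreintegrator.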
class IMUCorrector(nn.Module):
    def __init__(self, size_list= [6, 64, 128, 128, 128, 6]):
        super().__init__()
        layers = []
        self.size_list = size_list
        # Hidden layers: Linear followed by GELU; the last Linear has no activation
        for i in range(len(size_list) - 2):
            layers.append(nn.Linear(size_list[i], size_list[i+1]))
            layers.append(nn.GELU())
        layers.append(nn.Linear(size_list[-2], size_list[-1]))
        self.net = nn.Sequential(*layers)
        self.imu = pp.module.IMUPreintegrator(reset=True, prop_cov=False)

    def forward(self, data, init_state):
        # Concatenate acc and gyro into a 6-dimensional feature per frame
        feature = torch.cat([data["acc"], data["gyro"]], dim = -1)
        B, F = feature.shape[:2]
        output = self.net(feature.reshape(B*F,6)).reshape(B, F, 6)
        # Add the predicted correction to the raw measurements
        corrected_acc = output[...,:3] + data["acc"]
        corrected_gyro = output[...,3:] + data["gyro"]
        return self.imu(init_state = init_state,
                        dt = data['dt'],
                        gyro = corrected_gyro,
                        acc = corrected_acc,
                        rot = data['gt_rot'].contiguous())
4. Define the Loss Function
The loss function consists of two parts: position loss and rotation loss.
For the position loss, we use torch.nn.functional.mse_loss, which is the mean squared error.
See the docs for more detail.
For the rotation loss, we first compute the pose error between the output rotation and the ground truth rotation, and then take the norm of the Lie algebra of the pose error.
Finally, we add the two losses together as our combined loss.
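Written out, the combined loss looks roughly as follows. The symbols are assumptions introduced for this note: $B$ is the batch size, $p_b$ and $R_b$ are the integrated position and rotation at the last frame of sample $b$, and the superscript gt marks the ground truth. The formulas assume the default mean reduction of mse_loss and a norm taken over the whole batch of Lie algebra vectors.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{pos}} &= \frac{1}{3B}\sum_{b=1}^{B}\big\lVert p_b - p_b^{\mathrm{gt}}\big\rVert_2^2, \\
\mathcal{L}_{\mathrm{rot}} &= \sqrt{\sum_{b=1}^{B}\big\lVert \mathrm{Log}\big(R_b^{\mathrm{gt}}\,R_b^{-1}\big)\big\rVert_2^2}, \\
\mathcal{L} &= \mathcal{L}_{\mathrm{pos}} + \mathcal{L}_{\mathrm{rot}} .
\end{aligned}
```

Note that the two terms carry different units (squared meters and radians) and are summed without weighting.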
def get_loss(inte_state, data):
    pos_loss = torch.nn.functional.mse_loss(inte_state['pos'][:,-1,:], data['gt_pos'][:,-1,:])
    rot_loss = (data['gt_rot'][:,-1,:] * inte_state['rot'][:,-1,:].Inv()).Log().norm()
    loss = pos_loss + rot_loss
    return loss, {'pos_loss': pos_loss, 'rot_loss': rot_loss}
5. Define the Training Process
- This is the training process, which has three steps:
Step 1: Run forward function, to get the current network output
Step 2: Collect loss, for doing backward in Step 3
Step 3: Get gradients and do optimization
def train(network, train_loader, epoch, optimizer, device="cuda:0"):
    """
    Train the network for one epoch using the specified data loader.
    Returns the mean training loss over the batches of the epoch.
    """
    network.train()
    running_loss = 0
    t_range = tqdm.tqdm(train_loader)
    for i, data in enumerate(t_range):
        # Step 1: Run forward function
        data = move_to(data, device)
        init_state = {
            "pos": data['init_pos'],
            "rot": data['init_rot'][:,:1,:],
            "vel": data['init_vel'],}
        state = network(data, init_state)
        # Step 2: Collect loss
        losses, _ = get_loss(state, data)
        running_loss += losses.item()
        # Step 3: Get gradients and do optimization
        t_range.set_description(f'iteration: {i:04d}, losses: {losses:.06f}')
        t_range.refresh()
        optimizer.zero_grad()  # clear gradients left over from the previous iteration
        losses.backward()
        optimizer.step()
    # i is the index of the last batch, so i + 1 batches were processed
    return running_loss / (i + 1)
6. Define the Testing Process
- This is the testing process, which has two steps:
Step 1: Run forward function, to get the current network output
Step 2: Collect loss, to evaluate the network performance
def test(network, loader, device = "cuda:0"):
    network.eval()
    with torch.no_grad():
        running_loss = 0
        for i, data in enumerate(tqdm.tqdm(loader)):
            # Step 1: Run forward function
            data = move_to(data, device)
            init_state = {
                "pos": data['init_pos'],
                "rot": data['init_rot'][:,:1,:],
                "vel": data['init_vel'],}
            state = network(data, init_state)
            # Step 2: Collect loss
            losses, _ = get_loss(state, data)
            running_loss += losses.item()
        # i is the index of the last batch, so i + 1 batches were processed
        mean_loss = running_loss / (i + 1)
        print("the running loss of the test set %0.6f"%(mean_loss))
    return mean_loss
7. Define Parameters
Here we define all the parameters we will use. See the help message for the usage of each parameter.
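As a small, self-contained illustration of how these options can be overridden, the sketch below rebuilds three of the arguments and parses an explicit argument list instead of the real command line. The flag "--unrelated=1" is a made-up placeholder for an option injected by another tool (for example a notebook kernel); parse_known_args collects such flags instead of failing.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--device", type=str, default="cuda:0", help="cuda or cpu")
parser.add_argument("--batch-size", type=int, default=4, help="batch size")
parser.add_argument("--datadrive", nargs="+", type=str, default=["0001"],
                    help="data sequences")

# Parse an explicit list rather than sys.argv so the example is reproducible.
args, unknown = parser.parse_known_args(
    ["--device", "cpu", "--batch-size", "8",
     "--datadrive", "0001", "0002", "--unrelated=1"]
)

# "--batch-size" is exposed as args.batch_size; unrecognized flags land in `unknown`.
print(args.device, args.batch_size, args.datadrive, unknown)
```

Using parse_known_args rather than parse_args is a convenient choice for scripts that may also run inside notebooks, where extra arguments are often present.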
parser = argparse.ArgumentParser()
parser.add_argument("--device",
type=str,
default='cuda:0',
help="cuda or cpu")
parser.add_argument("--batch-size",
type=int,
default=4,
help="batch size")
parser.add_argument("--max_epoches",
type=int,
default=100,
help="max_epoches")
parser.add_argument("--dataroot",
type=str,
default='../dataset',
help="dataset location downloaded")
parser.add_argument("--dataname",
type=str,
default='2011_09_26',
help="dataset name")
parser.add_argument("--datadrive",
nargs='+',
type=str,
default=[ "0001"],
help="data sequences")
parser.add_argument('--load_ckpt',
default=False,
action="store_true")
args, unknown = parser.parse_known_args(); print(args)
Namespace(device='cuda:0', batch_size=4, max_epoches=100, dataroot='../dataset', dataname='2011_09_26', datadrive=['0001'], load_ckpt=False)
8. Define Dataloaders
train_dataset = KITTI_IMU(args.dataroot, args.dataname, args.datadrive[0],
duration=10, mode='train')
test_dataset = KITTI_IMU(args.dataroot, args.dataname, args.datadrive[0],
duration=10, mode='test')
train_loader = Data.DataLoader(dataset=train_dataset, batch_size=args.batch_size,
collate_fn=imu_collate, shuffle=True)
test_loader = Data.DataLoader(dataset=test_dataset, batch_size=args.batch_size,
collate_fn=imu_collate, shuffle=False)
9. Main Training Loop
Here we run the main training loop. First, as usual in PyTorch, we define the network, optimizer, and scheduler.
If you are not familiar with the process of training a network, we recommend reading one of the PyTorch tutorials, like this.
For each epoch, we run both training and testing once and collect the running loss. The output messages below show that the running loss is decreasing, which means our IMUCorrector is working.
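For readers unfamiliar with plateau-based scheduling, the following is a simplified sketch of the idea: the learning rate is multiplied by factor once the monitored loss has failed to improve for more than patience consecutive epochs. The class PlateauSchedule is a hypothetical stand-in written for this note; it ignores the improvement threshold and cooldown options of the real PyTorch scheduler, and the loss values are made up.

```python
class PlateauSchedule:
    """Simplified reduce-on-plateau rule (hypothetical helper, not the PyTorch class)."""

    def __init__(self, lr, factor=0.1, patience=10):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.num_bad_epochs = 0

    def step(self, loss):
        # Track the best loss seen so far and count epochs without improvement
        if loss < self.best:
            self.best = loss
            self.num_bad_epochs = 0
        else:
            self.num_bad_epochs += 1
        # Shrink the learning rate once the patience budget is exceeded
        if self.num_bad_epochs > self.patience:
            self.lr *= self.factor
            self.num_bad_epochs = 0
        return self.lr


schedule = PlateauSchedule(lr=5e-6, factor=0.1, patience=2)
history = [schedule.step(loss) for loss in [0.30, 0.28, 0.28, 0.29, 0.28, 0.27]]
print(history)
```

In this toy run the loss stalls for three epochs in a row, which exceeds the patience of 2, so the learning rate is reduced once, on the fifth call.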
network = IMUCorrector().to(args.device)
optimizer = torch.optim.Adam(network.parameters(), lr = 5e-6) # to use with ViTs
scheduler = ReduceLROnPlateau(optimizer, 'min', factor = 0.1, patience = 10) # default setup
for epoch_i in range(args.max_epoches):
    train_loss = train(network, train_loader, epoch_i, optimizer, device = args.device)
    test_loss = test(network, test_loader, device = args.device)
    scheduler.step(train_loss)
    print("train loss: %f test loss: %f "%(train_loss, test_loss))
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.299206: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.299206: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.300126: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.300126: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.300553: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.300553: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.293976: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.293976: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.293690: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.293690: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.288384: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.288384: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.292710: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.292710: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.294848: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.294848: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.294848: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0008, losses: 0.290991: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0008, losses: 0.290991: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0009, losses: 0.287749: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0009, losses: 0.287749: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0010, losses: 0.248864: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0010, losses: 0.248864: 73%|#######2 | 8/11 [00:00<00:00, 74.22it/s]
iteration: 0010, losses: 0.248864: 100%|##########| 11/11 [00:00<00:00, 85.65it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 151.93it/s]
the running loss of the test set 0.313447
train loss: 0.319110 test loss: 0.313447
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.285986: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.285986: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.282041: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.282041: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.286413: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.286413: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.280192: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.280192: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.278516: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.278516: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.275917: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.275917: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.273201: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.273201: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.274058: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.274058: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.272992: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.272992: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.272992: 82%|########1 | 9/11 [00:00<00:00, 85.81it/s]
iteration: 0009, losses: 0.273937: 82%|########1 | 9/11 [00:00<00:00, 85.81it/s]
iteration: 0009, losses: 0.273937: 82%|########1 | 9/11 [00:00<00:00, 85.81it/s]
iteration: 0010, losses: 0.239127: 82%|########1 | 9/11 [00:00<00:00, 85.81it/s]
iteration: 0010, losses: 0.239127: 82%|########1 | 9/11 [00:00<00:00, 85.81it/s]
iteration: 0010, losses: 0.239127: 100%|##########| 11/11 [00:00<00:00, 75.88it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 211.28it/s]
the running loss of the test set 0.295152
train loss: 0.302238 test loss: 0.295152
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.270017: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.270017: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.266822: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.266822: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.265930: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.265930: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.262890: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.262890: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.261928: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.261928: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.261753: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.261753: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.256637: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.256637: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.262146: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.262146: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.252307: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.252307: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.253908: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.253908: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.253908: 91%|######### | 10/11 [00:00<00:00, 99.49it/s]
iteration: 0010, losses: 0.218251: 91%|######### | 10/11 [00:00<00:00, 99.49it/s]
iteration: 0010, losses: 0.218251: 91%|######### | 10/11 [00:00<00:00, 99.49it/s]
iteration: 0010, losses: 0.218251: 100%|##########| 11/11 [00:00<00:00, 102.81it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 253.77it/s]
the running loss of the test set 0.275216
train loss: 0.283259 test loss: 0.275216
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.248931: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.248931: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.246487: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.246487: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.244090: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.244090: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.244767: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.244767: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.246740: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.246740: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.243117: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.243117: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.241229: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.241229: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.237942: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.237942: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.235601: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.235601: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.235601: 82%|########1 | 9/11 [00:00<00:00, 78.78it/s]
iteration: 0009, losses: 0.232927: 82%|########1 | 9/11 [00:00<00:00, 78.78it/s]
iteration: 0009, losses: 0.232927: 82%|########1 | 9/11 [00:00<00:00, 78.78it/s]
iteration: 0010, losses: 0.211547: 82%|########1 | 9/11 [00:00<00:00, 78.78it/s]
iteration: 0010, losses: 0.211547: 82%|########1 | 9/11 [00:00<00:00, 78.78it/s]
iteration: 0010, losses: 0.211547: 100%|##########| 11/11 [00:00<00:00, 83.44it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 241.48it/s]
the running loss of the test set 0.254224
train loss: 0.263338 test loss: 0.254224
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.236652: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.236652: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.228599: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.228599: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.228709: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.228709: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.222794: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.222794: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.226223: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.226223: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.222300: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.222300: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.217385: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.217385: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.218240: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.218240: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.214193: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.214193: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.218399: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.218399: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.187245: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.187245: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.187245: 100%|##########| 11/11 [00:00<00:00, 103.71it/s]
iteration: 0010, losses: 0.187245: 100%|##########| 11/11 [00:00<00:00, 103.56it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 177.40it/s]
the running loss of the test set 0.232710
train loss: 0.242074 test loss: 0.232710
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.212669: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.212669: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.215685: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.215685: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.207338: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.207338: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.202692: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.202692: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.200854: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.200854: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.199938: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.199938: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.201804: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.201804: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.204362: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.204362: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.198220: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.198220: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.198220: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0009, losses: 0.195698: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0009, losses: 0.195698: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0010, losses: 0.167064: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0010, losses: 0.167064: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0010, losses: 0.167064: 100%|##########| 11/11 [00:00<00:00, 90.52it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 186.87it/s]
the running loss of the test set 0.211048
train loss: 0.220632 test loss: 0.211048
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.192203: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.192203: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.191510: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.191510: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.187954: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.187954: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.184028: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.184028: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.190685: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.190685: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.184154: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.184154: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.185778: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.185778: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.178413: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.178413: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.178413: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0008, losses: 0.176472: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0008, losses: 0.176472: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0009, losses: 0.173393: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0009, losses: 0.173393: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0010, losses: 0.147828: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0010, losses: 0.147828: 73%|#######2 | 8/11 [00:00<00:00, 78.53it/s]
iteration: 0010, losses: 0.147828: 100%|##########| 11/11 [00:00<00:00, 88.67it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 180.11it/s]
the running loss of the test set 0.189462
train loss: 0.199242 test loss: 0.189462
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.173565: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.173565: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.169454: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.169454: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.165186: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.165186: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.159335: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.159335: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.168375: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.168375: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.162640: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.162640: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.162640: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0006, losses: 0.162674: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0006, losses: 0.162674: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0007, losses: 0.165029: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0007, losses: 0.165029: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0008, losses: 0.163277: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0008, losses: 0.163277: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0009, losses: 0.151499: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0009, losses: 0.151499: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0010, losses: 0.141346: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0010, losses: 0.141346: 55%|#####4 | 6/11 [00:00<00:00, 58.85it/s]
iteration: 0010, losses: 0.141346: 100%|##########| 11/11 [00:00<00:00, 78.56it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 256.57it/s]
the running loss of the test set 0.168015
train loss: 0.178238 test loss: 0.168015
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.148611: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.148611: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.152096: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.152096: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.156074: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.156074: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.146314: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.146314: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.140796: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.140796: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.143451: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.143451: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.144611: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.144611: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.136184: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.136184: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.140919: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.140919: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.140919: 82%|########1 | 9/11 [00:00<00:00, 83.87it/s]
iteration: 0009, losses: 0.137465: 82%|########1 | 9/11 [00:00<00:00, 83.87it/s]
iteration: 0009, losses: 0.137465: 82%|########1 | 9/11 [00:00<00:00, 83.87it/s]
iteration: 0010, losses: 0.125587: 82%|########1 | 9/11 [00:00<00:00, 83.87it/s]
iteration: 0010, losses: 0.125587: 82%|########1 | 9/11 [00:00<00:00, 83.87it/s]
iteration: 0010, losses: 0.125587: 100%|##########| 11/11 [00:00<00:00, 78.92it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 361.50it/s]
the running loss of the test set 0.146862
train loss: 0.157211 test loss: 0.146862
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.130732: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.130732: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.130403: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.130403: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.127034: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.127034: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.134334: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.134334: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.128720: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.128720: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.126910: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.126910: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.118896: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.118896: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.120953: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.120953: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.120953: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0008, losses: 0.126049: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0008, losses: 0.126049: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0009, losses: 0.116705: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0009, losses: 0.116705: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0010, losses: 0.103275: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0010, losses: 0.103275: 73%|#######2 | 8/11 [00:00<00:00, 79.09it/s]
iteration: 0010, losses: 0.103275: 100%|##########| 11/11 [00:00<00:00, 91.32it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 182.18it/s]
the running loss of the test set 0.126329
train loss: 0.136401 test loss: 0.126329
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.112587: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.112587: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.112408: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.112408: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.113911: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.113911: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.107583: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.107583: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.117529: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.117529: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.109132: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.109132: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.098946: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.098946: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.097891: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.097891: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.104212: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.104212: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.104212: 82%|########1 | 9/11 [00:00<00:00, 84.13it/s]
iteration: 0009, losses: 0.104201: 82%|########1 | 9/11 [00:00<00:00, 84.13it/s]
iteration: 0009, losses: 0.104201: 82%|########1 | 9/11 [00:00<00:00, 84.13it/s]
iteration: 0010, losses: 0.087202: 82%|########1 | 9/11 [00:00<00:00, 84.13it/s]
iteration: 0010, losses: 0.087202: 82%|########1 | 9/11 [00:00<00:00, 84.13it/s]
iteration: 0010, losses: 0.087202: 100%|##########| 11/11 [00:00<00:00, 91.81it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 183.63it/s]
the running loss of the test set 0.106817
train loss: 0.116560 test loss: 0.106817
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.102293: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.090598: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.092002: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.089443: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.095486: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.089027: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.088844: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.085642: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.086503: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.082710: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.082710: 91%|######### | 10/11 [00:00<00:00, 85.78it/s]
iteration: 0010, losses: 0.078762: 91%|######### | 10/11 [00:00<00:00, 85.78it/s]
iteration: 0010, losses: 0.078762: 100%|##########| 11/11 [00:00<00:00, 88.15it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 183.03it/s]
the running loss of the test set 0.088984
train loss: 0.098131 test loss: 0.088984
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.076650: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.078412: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.079756: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.079526: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.081404: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.069438: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.076541: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.071877: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.070603: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.070603: 82%|########1 | 9/11 [00:00<00:00, 83.82it/s]
iteration: 0009, losses: 0.070901: 82%|########1 | 9/11 [00:00<00:00, 83.82it/s]
iteration: 0010, losses: 0.059825: 82%|########1 | 9/11 [00:00<00:00, 83.82it/s]
iteration: 0010, losses: 0.059825: 100%|##########| 11/11 [00:00<00:00, 88.91it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 361.45it/s]
the running loss of the test set 0.074079
train loss: 0.081493 test loss: 0.074079
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.068454: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.070304: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.063605: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.066925: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.062110: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.069830: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.065209: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.051501: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.061460: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.061460: 82%|########1 | 9/11 [00:00<00:00, 88.47it/s]
iteration: 0009, losses: 0.061530: 82%|########1 | 9/11 [00:00<00:00, 88.47it/s]
iteration: 0010, losses: 0.047427: 82%|########1 | 9/11 [00:00<00:00, 88.47it/s]
iteration: 0010, losses: 0.047427: 100%|##########| 11/11 [00:00<00:00, 79.23it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 288.88it/s]
the running loss of the test set 0.064345
train loss: 0.068835 test loss: 0.064345
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.066862: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.061448: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.050301: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.065330: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.061371: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.051886: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.062064: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.057578: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.047592: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.047592: 82%|########1 | 9/11 [00:00<00:00, 87.37it/s]
iteration: 0009, losses: 0.048645: 82%|########1 | 9/11 [00:00<00:00, 87.37it/s]
iteration: 0010, losses: 0.052881: 82%|########1 | 9/11 [00:00<00:00, 87.37it/s]
iteration: 0010, losses: 0.052881: 100%|##########| 11/11 [00:00<00:00, 93.04it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 169.26it/s]
the running loss of the test set 0.062480
train loss: 0.062596 test loss: 0.062480
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.046240: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.054048: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.059302: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.060702: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.063208: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.063234: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.055930: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.056020: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.060961: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.062120: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.066409: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.066409: 100%|##########| 11/11 [00:00<00:00, 91.10it/s]
iteration: 0010, losses: 0.066409: 100%|##########| 11/11 [00:00<00:00, 90.87it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 331.40it/s]
the running loss of the test set 0.068396
train loss: 0.064817 test loss: 0.068396
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.061323: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.062432: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.054127: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.064007: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.076985: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.066494: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.069142: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.069142: 64%|######3 | 7/11 [00:00<00:00, 62.26it/s]
iteration: 0007, losses: 0.061364: 64%|######3 | 7/11 [00:00<00:00, 62.26it/s]
iteration: 0008, losses: 0.069430: 64%|######3 | 7/11 [00:00<00:00, 62.26it/s]
iteration: 0009, losses: 0.075199: 64%|######3 | 7/11 [00:00<00:00, 62.26it/s]
iteration: 0010, losses: 0.063954: 64%|######3 | 7/11 [00:00<00:00, 62.26it/s]
iteration: 0010, losses: 0.063954: 100%|##########| 11/11 [00:00<00:00, 78.38it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 179.70it/s]
the running loss of the test set 0.078923
train loss: 0.072446 test loss: 0.078923
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.076294: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.074770: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.076390: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.079386: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.073033: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.073972: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.071823: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.077141: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.078082: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.078082: 82%|########1 | 9/11 [00:00<00:00, 85.78it/s]
iteration: 0009, losses: 0.082783: 82%|########1 | 9/11 [00:00<00:00, 85.78it/s]
iteration: 0010, losses: 0.076334: 82%|########1 | 9/11 [00:00<00:00, 85.78it/s]
iteration: 0010, losses: 0.076334: 100%|##########| 11/11 [00:00<00:00, 80.27it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 350.25it/s]
the running loss of the test set 0.091431
train loss: 0.084001 test loss: 0.091431
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.078563: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.085481: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.085929: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.082398: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.079654: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.085445: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.094175: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.094175: 64%|######3 | 7/11 [00:00<00:00, 65.45it/s]
iteration: 0007, losses: 0.095211: 64%|######3 | 7/11 [00:00<00:00, 65.45it/s]
iteration: 0008, losses: 0.092269: 64%|######3 | 7/11 [00:00<00:00, 65.45it/s]
iteration: 0009, losses: 0.094152: 64%|######3 | 7/11 [00:00<00:00, 65.45it/s]
iteration: 0010, losses: 0.092603: 64%|######3 | 7/11 [00:00<00:00, 65.45it/s]
iteration: 0010, losses: 0.092603: 100%|##########| 11/11 [00:00<00:00, 65.90it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 367.93it/s]
the running loss of the test set 0.104514
train loss: 0.096588 test loss: 0.104514
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.096337: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.096965: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.096236: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.094995: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.098197: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.103396: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.107509: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.102930: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.102930: 73%|#######2 | 8/11 [00:00<00:00, 79.23it/s]
iteration: 0008, losses: 0.103733: 73%|#######2 | 8/11 [00:00<00:00, 79.23it/s]
iteration: 0009, losses: 0.101060: 73%|#######2 | 8/11 [00:00<00:00, 79.23it/s]
iteration: 0010, losses: 0.092422: 73%|#######2 | 8/11 [00:00<00:00, 79.23it/s]
iteration: 0010, losses: 0.092422: 100%|##########| 11/11 [00:00<00:00, 91.14it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 177.14it/s]
the running loss of the test set 0.117363
train loss: 0.109378 test loss: 0.117363
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.105416: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.107201: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.112628: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.111003: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.105827: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.109192: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.115090: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.112302: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.111139: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.111139: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0009, losses: 0.116685: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0010, losses: 0.111876: 82%|########1 | 9/11 [00:00<00:00, 83.89it/s]
iteration: 0010, losses: 0.111876: 100%|##########| 11/11 [00:00<00:00, 91.02it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 190.89it/s]
the running loss of the test set 0.129476
train loss: 0.121836 test loss: 0.129476
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.116557: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.120670: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.116449: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.122299: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.123480: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.118714: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.123100: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.123865: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.126611: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.124369: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.116838: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.116838: 100%|##########| 11/11 [00:00<00:00, 106.88it/s]
iteration: 0010, losses: 0.116838: 100%|##########| 11/11 [00:00<00:00, 106.73it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 190.22it/s]
the running loss of the test set 0.140495
train loss: 0.133295 test loss: 0.140495
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.126652: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.130636: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.131283: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.128645: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.137103: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.129446: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.134166: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.131614: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.130953: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.138736: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.138736: 91%|######### | 10/11 [00:00<00:00, 99.26it/s]
iteration: 0010, losses: 0.115208: 91%|######### | 10/11 [00:00<00:00, 99.26it/s]
iteration: 0010, losses: 0.115208: 100%|##########| 11/11 [00:00<00:00, 90.88it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 365.81it/s]
the running loss of the test set 0.150146
train loss: 0.143444 test loss: 0.150146
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.135979: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.133076: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.143011: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.138150: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.135708: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.142212: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.144436: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.140820: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.140820: 73%|#######2 | 8/11 [00:00<00:00, 70.75it/s]
iteration: 0008, losses: 0.139386: 73%|#######2 | 8/11 [00:00<00:00, 70.75it/s]
iteration: 0009, losses: 0.144428: 73%|#######2 | 8/11 [00:00<00:00, 70.75it/s]
iteration: 0010, losses: 0.126220: 73%|#######2 | 8/11 [00:00<00:00, 70.75it/s]
iteration: 0010, losses: 0.126220: 100%|##########| 11/11 [00:00<00:00, 79.75it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 185.07it/s]
the running loss of the test set 0.158207
train loss: 0.152343 test loss: 0.158207
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.140088: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.147604: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.143452: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.143363: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.145350: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.144224: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.145488: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.145736: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.152367: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0009, losses: 0.153763: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.134793: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0010, losses: 0.134793: 100%|##########| 11/11 [00:00<00:00, 91.10it/s]
iteration: 0010, losses: 0.134793: 100%|##########| 11/11 [00:00<00:00, 90.98it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 254.76it/s]
the running loss of the test set 0.164489
train loss: 0.159623 test loss: 0.164489
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.150128: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.146342: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.150295: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.155500: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.150649: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.149775: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.152384: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.153298: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.153298: 73%|#######2 | 8/11 [00:00<00:00, 79.81it/s]
iteration: 0008, losses: 0.154586: 73%|#######2 | 8/11 [00:00<00:00, 79.81it/s]
iteration: 0009, losses: 0.150456: 73%|#######2 | 8/11 [00:00<00:00, 79.81it/s]
iteration: 0010, losses: 0.135508: 73%|#######2 | 8/11 [00:00<00:00, 79.81it/s]
iteration: 0010, losses: 0.135508: 100%|##########| 11/11 [00:00<00:00, 89.12it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 238.45it/s]
the running loss of the test set 0.168834
train loss: 0.164892 test loss: 0.168834
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.156171: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.154033: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.156807: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.152150: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.153578: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.152711: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.153379: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.153377: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.155579: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.155579: 82%|########1 | 9/11 [00:00<00:00, 69.70it/s]
iteration: 0009, losses: 0.151356: 82%|########1 | 9/11 [00:00<00:00, 69.70it/s]
iteration: 0010, losses: 0.132382: 82%|########1 | 9/11 [00:00<00:00, 69.70it/s]
iteration: 0010, losses: 0.132382: 100%|##########| 11/11 [00:00<00:00, 74.65it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 348.59it/s]
the running loss of the test set 0.169061
train loss: 0.167152 test loss: 0.169061
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.156114: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.150304: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.155756: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.153195: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.151565: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.153062: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.156645: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.149183: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.158133: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.158133: 82%|########1 | 9/11 [00:00<00:00, 86.02it/s]
iteration: 0009, losses: 0.151356: 82%|########1 | 9/11 [00:00<00:00, 86.02it/s]
iteration: 0010, losses: 0.139301: 82%|########1 | 9/11 [00:00<00:00, 86.02it/s]
iteration: 0010, losses: 0.139301: 100%|##########| 11/11 [00:00<00:00, 92.88it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 191.72it/s]
the running loss of the test set 0.169075
train loss: 0.167462 test loss: 0.169075
0%| | 0/11 [00:00<?, ?it/s]
iteration: 0000, losses: 0.153920: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0001, losses: 0.155429: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0002, losses: 0.152408: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0003, losses: 0.158795: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0004, losses: 0.155310: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0005, losses: 0.152439: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0006, losses: 0.150898: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0007, losses: 0.154012: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.153418: 0%| | 0/11 [00:00<?, ?it/s]
iteration: 0008, losses: 0.153418: 82%|########1 | 9/11 [00:00<00:00, 84.11it/s]
iteration: 0009, losses: 0.154649: 82%|########1 | 9/11 [00:00<00:00, 84.11it/s]
iteration: 0010, losses: 0.130191: 82%|########1 | 9/11 [00:00<00:00, 84.11it/s]
iteration: 0010, losses: 0.130191: 100%|##########| 11/11 [00:00<00:00, 90.17it/s]
0%| | 0/11 [00:00<?, ?it/s]
100%|##########| 11/11 [00:00<00:00, 186.73it/s]
the running loss of the test set 0.168873
train loss: 0.167147 test loss: 0.168873
iteration: 0000, losses: 0.152105
iteration: 0001, losses: 0.157480
iteration: 0002, losses: 0.156845
iteration: 0003, losses: 0.153218
iteration: 0004, losses: 0.157811
iteration: 0005, losses: 0.148738
iteration: 0006, losses: 0.153560
iteration: 0007, losses: 0.149901
iteration: 0008, losses: 0.153833
iteration: 0009, losses: 0.151830
iteration: 0010, losses: 0.134336
the running loss of the test set 0.168456
train loss: 0.166966 test loss: 0.168456
iteration: 0000, losses: 0.153396
iteration: 0001, losses: 0.150904
iteration: 0002, losses: 0.151907
iteration: 0003, losses: 0.157253
iteration: 0004, losses: 0.159047
iteration: 0005, losses: 0.151941
iteration: 0006, losses: 0.151354
iteration: 0007, losses: 0.150719
iteration: 0008, losses: 0.150353
iteration: 0009, losses: 0.151921
iteration: 0010, losses: 0.135562
the running loss of the test set 0.167827
train loss: 0.166436 test loss: 0.167827
iteration: 0000, losses: 0.151329
iteration: 0001, losses: 0.155242
iteration: 0002, losses: 0.153846
iteration: 0003, losses: 0.148325
iteration: 0004, losses: 0.154395
iteration: 0005, losses: 0.153641
iteration: 0006, losses: 0.151978
iteration: 0007, losses: 0.147988
iteration: 0008, losses: 0.150528
iteration: 0009, losses: 0.151626
iteration: 0010, losses: 0.139228
the running loss of the test set 0.166991
train loss: 0.165813 test loss: 0.166991
iteration: 0000, losses: 0.150114
iteration: 0001, losses: 0.148636
iteration: 0002, losses: 0.150640
iteration: 0003, losses: 0.150756
iteration: 0004, losses: 0.153690
iteration: 0005, losses: 0.150230
iteration: 0006, losses: 0.151352
iteration: 0007, losses: 0.152860
iteration: 0008, losses: 0.153802
iteration: 0009, losses: 0.154445
iteration: 0010, losses: 0.131607
the running loss of the test set 0.165956
train loss: 0.164813 test loss: 0.165956
iteration: 0000, losses: 0.152993
iteration: 0001, losses: 0.151913
iteration: 0002, losses: 0.150977
iteration: 0003, losses: 0.156742
iteration: 0004, losses: 0.147761
iteration: 0005, losses: 0.146633
iteration: 0006, losses: 0.151765
iteration: 0007, losses: 0.146716
iteration: 0008, losses: 0.149525
iteration: 0009, losses: 0.151221
iteration: 0010, losses: 0.130850
the running loss of the test set 0.164734
train loss: 0.163709 test loss: 0.164734
iteration: 0000, losses: 0.152149
iteration: 0001, losses: 0.147092
iteration: 0002, losses: 0.148405
iteration: 0003, losses: 0.152592
iteration: 0004, losses: 0.148223
iteration: 0005, losses: 0.151483
iteration: 0006, losses: 0.148808
iteration: 0007, losses: 0.148334
iteration: 0008, losses: 0.147869
iteration: 0009, losses: 0.145979
iteration: 0010, losses: 0.134784
the running loss of the test set 0.163337
train loss: 0.162572 test loss: 0.163337
iteration: 0000, losses: 0.151284
iteration: 0001, losses: 0.149129
iteration: 0002, losses: 0.156874
iteration: 0003, losses: 0.144536
iteration: 0004, losses: 0.144972
iteration: 0005, losses: 0.148982
iteration: 0006, losses: 0.146560
iteration: 0007, losses: 0.148403
iteration: 0008, losses: 0.145420
iteration: 0009, losses: 0.145588
iteration: 0010, losses: 0.127750
the running loss of the test set 0.161780
train loss: 0.160950 test loss: 0.161780
iteration: 0000, losses: 0.145351
iteration: 0001, losses: 0.152932
iteration: 0002, losses: 0.148416
iteration: 0003, losses: 0.150428
iteration: 0004, losses: 0.146949
iteration: 0005, losses: 0.148915
iteration: 0006, losses: 0.142098
iteration: 0007, losses: 0.143890
iteration: 0008, losses: 0.144158
iteration: 0009, losses: 0.143796
iteration: 0010, losses: 0.126251
the running loss of the test set 0.160078
train loss: 0.159319 test loss: 0.160078
iteration: 0000, losses: 0.148096
iteration: 0001, losses: 0.146372
iteration: 0002, losses: 0.147710
iteration: 0003, losses: 0.142027
iteration: 0004, losses: 0.147475
iteration: 0005, losses: 0.145814
iteration: 0006, losses: 0.143676
iteration: 0007, losses: 0.145261
iteration: 0008, losses: 0.146123
iteration: 0009, losses: 0.145551
iteration: 0010, losses: 0.124971
the running loss of the test set 0.159894
train loss: 0.158308 test loss: 0.159894
iteration: 0000, losses: 0.149356
iteration: 0001, losses: 0.142194
iteration: 0002, losses: 0.150209
iteration: 0003, losses: 0.145195
iteration: 0004, losses: 0.142505
iteration: 0005, losses: 0.146016
iteration: 0006, losses: 0.142433
iteration: 0007, losses: 0.142658
iteration: 0008, losses: 0.144366
iteration: 0009, losses: 0.148367
iteration: 0010, losses: 0.128413
the running loss of the test set 0.159699
train loss: 0.158171 test loss: 0.159699
iteration: 0000, losses: 0.144571
iteration: 0001, losses: 0.142122
iteration: 0002, losses: 0.144863
iteration: 0003, losses: 0.144839
iteration: 0004, losses: 0.142969
iteration: 0005, losses: 0.146011
iteration: 0006, losses: 0.149970
iteration: 0007, losses: 0.148549
iteration: 0008, losses: 0.141513
iteration: 0009, losses: 0.147090
iteration: 0010, losses: 0.127453
the running loss of the test set 0.159493
train loss: 0.157995 test loss: 0.159493
iteration: 0000, losses: 0.144717
iteration: 0001, losses: 0.144641
iteration: 0002, losses: 0.144201
iteration: 0003, losses: 0.145869
iteration: 0004, losses: 0.145609
iteration: 0005, losses: 0.140880
iteration: 0006, losses: 0.148887
iteration: 0007, losses: 0.144380
iteration: 0008, losses: 0.146893
iteration: 0009, losses: 0.147090
iteration: 0010, losses: 0.123721
the running loss of the test set 0.159277
train loss: 0.157689 test loss: 0.159277
iteration: 0000, losses: 0.147223
iteration: 0001, losses: 0.141434
iteration: 0002, losses: 0.145635
iteration: 0003, losses: 0.141822
iteration: 0004, losses: 0.146172
iteration: 0005, losses: 0.140953
iteration: 0006, losses: 0.146074
iteration: 0007, losses: 0.147929
iteration: 0008, losses: 0.147657
iteration: 0009, losses: 0.142199
iteration: 0010, losses: 0.128812
the running loss of the test set 0.159053
train loss: 0.157591 test loss: 0.159053
iteration: 0000, losses: 0.146868
iteration: 0001, losses: 0.146053
iteration: 0002, losses: 0.142472
iteration: 0003, losses: 0.141248
iteration: 0004, losses: 0.142822
iteration: 0005, losses: 0.142209
iteration: 0006, losses: 0.144675
iteration: 0007, losses: 0.143876
iteration: 0008, losses: 0.146293
iteration: 0009, losses: 0.145098
iteration: 0010, losses: 0.132456
the running loss of the test set 0.158822
train loss: 0.157407 test loss: 0.158822
iteration: 0000, losses: 0.143955
iteration: 0001, losses: 0.143095
iteration: 0002, losses: 0.144462
iteration: 0003, losses: 0.148772
iteration: 0004, losses: 0.142935
iteration: 0005, losses: 0.143608
iteration: 0006, losses: 0.143890
iteration: 0007, losses: 0.142942
iteration: 0008, losses: 0.145602
iteration: 0009, losses: 0.144574
iteration: 0010, losses: 0.127424
the running loss of the test set 0.158585
train loss: 0.157126 test loss: 0.158585
iteration: 0000, losses: 0.142366
iteration: 0001, losses: 0.139528
iteration: 0002, losses: 0.145957
iteration: 0003, losses: 0.141366
iteration: 0004, losses: 0.149768
iteration: 0005, losses: 0.142049
iteration: 0006, losses: 0.146685
iteration: 0007, losses: 0.143344
iteration: 0008, losses: 0.147672
iteration: 0009, losses: 0.144595
iteration: 0010, losses: 0.124965
the running loss of the test set 0.158342
train loss: 0.156830 test loss: 0.158342
iteration: 0000, losses: 0.141712
iteration: 0001, losses: 0.145025
iteration: 0002, losses: 0.142295
iteration: 0003, losses: 0.144204
iteration: 0004, losses: 0.141404
iteration: 0005, losses: 0.143934
iteration: 0006, losses: 0.145742
iteration: 0007, losses: 0.143422
iteration: 0008, losses: 0.146308
iteration: 0009, losses: 0.145457
iteration: 0010, losses: 0.126249
the running loss of the test set 0.158094
train loss: 0.156575 test loss: 0.158094
iteration: 0000, losses: 0.141913
iteration: 0001, losses: 0.144524
iteration: 0002, losses: 0.141508
iteration: 0003, losses: 0.141705
iteration: 0004, losses: 0.148190
iteration: 0005, losses: 0.144646
iteration: 0006, losses: 0.144164
iteration: 0007, losses: 0.143023
iteration: 0008, losses: 0.141938
iteration: 0009, losses: 0.145025
iteration: 0010, losses: 0.127473
the running loss of the test set 0.157843
train loss: 0.156411 test loss: 0.157843
the running loss of the test set 0.157588
train loss: 0.156200 test loss: 0.157588
the running loss of the test set 0.157562
train loss: 0.156018 test loss: 0.157562
the running loss of the test set 0.157536
train loss: 0.155991 test loss: 0.157536
the running loss of the test set 0.157510
train loss: 0.155987 test loss: 0.157510
the running loss of the test set 0.157483
train loss: 0.155834 test loss: 0.157483
the running loss of the test set 0.157457
train loss: 0.155813 test loss: 0.157457
the running loss of the test set 0.157430
train loss: 0.155795 test loss: 0.157430
the running loss of the test set 0.157404
train loss: 0.155934 test loss: 0.157404
the running loss of the test set 0.157377
train loss: 0.155815 test loss: 0.157377
the running loss of the test set 0.157350
train loss: 0.155821 test loss: 0.157350
the running loss of the test set 0.157323
train loss: 0.155806 test loss: 0.157323
the running loss of the test set 0.157296
train loss: 0.155695 test loss: 0.157296
the running loss of the test set 0.157269
train loss: 0.155797 test loss: 0.157269
the running loss of the test set 0.157243
train loss: 0.155638 test loss: 0.157243
the running loss of the test set 0.157216
train loss: 0.155749 test loss: 0.157216
the running loss of the test set 0.157189
train loss: 0.155703 test loss: 0.157189
the running loss of the test set 0.157162
train loss: 0.155557 test loss: 0.157162
the running loss of the test set 0.157136
train loss: 0.155669 test loss: 0.157136
iteration: 0000, losses: 0.143422
iteration: 0001, losses: 0.141156
iteration: 0002, losses: 0.145044
iteration: 0003, losses: 0.139998
iteration: 0004, losses: 0.143978
iteration: 0005, losses: 0.141353
iteration: 0006, losses: 0.139604
iteration: 0007, losses: 0.148425
iteration: 0008, losses: 0.148575
iteration: 0009, losses: 0.140928
iteration: 0010, losses: 0.121693
the running loss of the test set 0.157109
train loss: 0.155418 test loss: 0.157109
iteration: 0000, losses: 0.142237
iteration: 0001, losses: 0.145051
iteration: 0002, losses: 0.145279
iteration: 0003, losses: 0.142286
iteration: 0004, losses: 0.140780
iteration: 0005, losses: 0.141817
iteration: 0006, losses: 0.141281
iteration: 0007, losses: 0.145442
iteration: 0008, losses: 0.141567
iteration: 0009, losses: 0.140439
iteration: 0010, losses: 0.129301
the running loss of the test set 0.157082
train loss: 0.155548 test loss: 0.157082
iteration: 0000, losses: 0.136976
iteration: 0001, losses: 0.144851
iteration: 0002, losses: 0.140751
iteration: 0003, losses: 0.142886
iteration: 0004, losses: 0.144760
iteration: 0005, losses: 0.145802
iteration: 0006, losses: 0.145235
iteration: 0007, losses: 0.145147
iteration: 0008, losses: 0.144385
iteration: 0009, losses: 0.140317
iteration: 0010, losses: 0.122873
the running loss of the test set 0.157056
train loss: 0.155398 test loss: 0.157056
iteration: 0000, losses: 0.140042
iteration: 0001, losses: 0.140554
iteration: 0002, losses: 0.145038
iteration: 0003, losses: 0.141641
iteration: 0004, losses: 0.143500
iteration: 0005, losses: 0.144193
iteration: 0006, losses: 0.148539
iteration: 0007, losses: 0.144504
iteration: 0008, losses: 0.140508
iteration: 0009, losses: 0.141966
iteration: 0010, losses: 0.123264
the running loss of the test set 0.157029
train loss: 0.155375 test loss: 0.157029
iteration: 0000, losses: 0.144306
iteration: 0001, losses: 0.145673
iteration: 0002, losses: 0.140846
iteration: 0003, losses: 0.141009
iteration: 0004, losses: 0.141089
iteration: 0005, losses: 0.144033
iteration: 0006, losses: 0.142672
iteration: 0007, losses: 0.145077
iteration: 0008, losses: 0.146369
iteration: 0009, losses: 0.138955
iteration: 0010, losses: 0.123892
the running loss of the test set 0.157003
train loss: 0.155392 test loss: 0.157003
iteration: 0000, losses: 0.140735
iteration: 0001, losses: 0.151336
iteration: 0002, losses: 0.139057
iteration: 0003, losses: 0.142421
iteration: 0004, losses: 0.142600
iteration: 0005, losses: 0.143840
iteration: 0006, losses: 0.143638
iteration: 0007, losses: 0.142886
iteration: 0008, losses: 0.140475
iteration: 0009, losses: 0.139953
iteration: 0010, losses: 0.127809
the running loss of the test set 0.156976
train loss: 0.155475 test loss: 0.156976
iteration: 0000, losses: 0.142148
iteration: 0001, losses: 0.141929
iteration: 0002, losses: 0.145284
iteration: 0003, losses: 0.144251
iteration: 0004, losses: 0.141307
iteration: 0005, losses: 0.144695
iteration: 0006, losses: 0.142861
iteration: 0007, losses: 0.140356
iteration: 0008, losses: 0.143422
iteration: 0009, losses: 0.141693
iteration: 0010, losses: 0.125957
the running loss of the test set 0.156950
train loss: 0.155390 test loss: 0.156950
iteration: 0000, losses: 0.143832
iteration: 0001, losses: 0.139270
iteration: 0002, losses: 0.142078
iteration: 0003, losses: 0.145323
iteration: 0004, losses: 0.143540
iteration: 0005, losses: 0.146326
iteration: 0006, losses: 0.140024
iteration: 0007, losses: 0.142107
iteration: 0008, losses: 0.141638
iteration: 0009, losses: 0.142537
iteration: 0010, losses: 0.127282
the running loss of the test set 0.156924
train loss: 0.155396 test loss: 0.156924
iteration: 0000, losses: 0.143298
iteration: 0001, losses: 0.142569
iteration: 0002, losses: 0.139687
iteration: 0003, losses: 0.145907
iteration: 0004, losses: 0.143397
iteration: 0005, losses: 0.140204
iteration: 0006, losses: 0.139755
iteration: 0007, losses: 0.142501
iteration: 0008, losses: 0.147777
iteration: 0009, losses: 0.144948
iteration: 0010, losses: 0.122314
the running loss of the test set 0.156898
train loss: 0.155236 test loss: 0.156898
iteration: 0000, losses: 0.138391
iteration: 0001, losses: 0.143908
iteration: 0002, losses: 0.145271
iteration: 0003, losses: 0.144318
iteration: 0004, losses: 0.144876
iteration: 0005, losses: 0.139595
iteration: 0006, losses: 0.143172
iteration: 0007, losses: 0.139433
iteration: 0008, losses: 0.144985
iteration: 0009, losses: 0.141782
iteration: 0010, losses: 0.128100
the running loss of the test set 0.156872
train loss: 0.155383 test loss: 0.156872
iteration: 0000, losses: 0.140359
iteration: 0001, losses: 0.146264
iteration: 0002, losses: 0.139848
iteration: 0003, losses: 0.146916
iteration: 0004, losses: 0.140451
iteration: 0005, losses: 0.147816
iteration: 0006, losses: 0.141634
iteration: 0007, losses: 0.140868
iteration: 0008, losses: 0.140791
iteration: 0009, losses: 0.140614
iteration: 0010, losses: 0.127336
the running loss of the test set 0.156846
train loss: 0.155290 test loss: 0.156846
iteration: 0000, losses: 0.142313
iteration: 0001, losses: 0.142720
iteration: 0002, losses: 0.147243
iteration: 0003, losses: 0.146658
iteration: 0004, losses: 0.145578
iteration: 0005, losses: 0.142515
iteration: 0006, losses: 0.139686
iteration: 0007, losses: 0.140816
iteration: 0008, losses: 0.139272
iteration: 0009, losses: 0.141411
iteration: 0010, losses: 0.123899
the running loss of the test set 0.156820
train loss: 0.155211 test loss: 0.156820
iteration: 0000, losses: 0.144096
iteration: 0001, losses: 0.140119
iteration: 0002, losses: 0.141705
iteration: 0003, losses: 0.139409
iteration: 0004, losses: 0.144709
iteration: 0005, losses: 0.146477
iteration: 0006, losses: 0.146355
iteration: 0007, losses: 0.140837
iteration: 0008, losses: 0.144370
iteration: 0009, losses: 0.140524
iteration: 0010, losses: 0.123116
the running loss of the test set 0.156794
train loss: 0.155172 test loss: 0.156794
iteration: 0000, losses: 0.139456
iteration: 0001, losses: 0.140512
iteration: 0002, losses: 0.145245
iteration: 0003, losses: 0.142591
iteration: 0004, losses: 0.141521
iteration: 0005, losses: 0.142258
iteration: 0006, losses: 0.143283
iteration: 0007, losses: 0.146888
iteration: 0008, losses: 0.142430
iteration: 0009, losses: 0.144698
iteration: 0010, losses: 0.122146
the running loss of the test set 0.156768
train loss: 0.155103 test loss: 0.156768
iteration: 0000, losses: 0.146615
iteration: 0001, losses: 0.144148
iteration: 0002, losses: 0.142476
iteration: 0003, losses: 0.139545
iteration: 0004, losses: 0.139770
iteration: 0005, losses: 0.142781
iteration: 0006, losses: 0.144287
iteration: 0007, losses: 0.138637
iteration: 0008, losses: 0.142690
iteration: 0009, losses: 0.143906
iteration: 0010, losses: 0.127747
the running loss of the test set 0.156742
train loss: 0.155260 test loss: 0.156742
iteration: 0000, losses: 0.142362
iteration: 0001, losses: 0.141604
iteration: 0002, losses: 0.141521
iteration: 0003, losses: 0.145272
iteration: 0004, losses: 0.144773
iteration: 0005, losses: 0.142376
iteration: 0006, losses: 0.139677
iteration: 0007, losses: 0.139017
iteration: 0008, losses: 0.142877
iteration: 0009, losses: 0.148054
iteration: 0010, losses: 0.123064
the running loss of the test set 0.156717
train loss: 0.155060 test loss: 0.156717
iteration: 0000, losses: 0.140234
iteration: 0001, losses: 0.145463
iteration: 0002, losses: 0.143963
iteration: 0003, losses: 0.142180
iteration: 0004, losses: 0.146149
iteration: 0005, losses: 0.142469
iteration: 0006, losses: 0.142158
iteration: 0007, losses: 0.143977
iteration: 0008, losses: 0.139005
iteration: 0009, losses: 0.139182
iteration: 0010, losses: 0.126859
the running loss of the test set 0.156691
train loss: 0.155164 test loss: 0.156691
iteration: 0000, losses: 0.144552
iteration: 0001, losses: 0.138843
iteration: 0002, losses: 0.140744
iteration: 0003, losses: 0.145039
iteration: 0004, losses: 0.142718
iteration: 0005, losses: 0.140338
iteration: 0006, losses: 0.140643
iteration: 0007, losses: 0.144302
iteration: 0008, losses: 0.143553
iteration: 0009, losses: 0.146815
iteration: 0010, losses: 0.123069
the running loss of the test set 0.156665
train loss: 0.155062 test loss: 0.156665
the running loss of the test set 0.156640
train loss: 0.155086 test loss: 0.156640
the running loss of the test set 0.156615
train loss: 0.155136 test loss: 0.156615
the running loss of the test set 0.156589
train loss: 0.154984 test loss: 0.156589
the running loss of the test set 0.156564
train loss: 0.154964 test loss: 0.156564
the running loss of the test set 0.156539
train loss: 0.154964 test loss: 0.156539
the running loss of the test set 0.156514
train loss: 0.154963 test loss: 0.156514
the running loss of the test set 0.156489
train loss: 0.154917 test loss: 0.156489
the running loss of the test set 0.156464
train loss: 0.155002 test loss: 0.156464
the running loss of the test set 0.156439
train loss: 0.154868 test loss: 0.156439
the running loss of the test set 0.156414
train loss: 0.154914 test loss: 0.156414
the running loss of the test set 0.156389
train loss: 0.154842 test loss: 0.156389
the running loss of the test set 0.156365
train loss: 0.154819 test loss: 0.156365
the running loss of the test set 0.156340
train loss: 0.154881 test loss: 0.156340
the running loss of the test set 0.156315
train loss: 0.154748 test loss: 0.156315
the running loss of the test set 0.156291
train loss: 0.154722 test loss: 0.156291
the running loss of the test set 0.156266
train loss: 0.154696 test loss: 0.156266
the running loss of the test set 0.156242
train loss: 0.154605 test loss: 0.156242
And that’s it. We’re done with the IMUCorrector tutorial. Thanks for reading.
Total running time of the script: ( 0 minutes 17.385 seconds)