Service (Troubleshooting)

On-site Storage Device Fix with Richard

    Around the end of March 2020, I had my first on-the-job training (OJT) with my co-worker Richard in Tokyo. The on-site visit was at a scientific agency, for a storage device task that required us to replace a faulty device with a refurbished replacement. We arrived 30 minutes early, and once we had met the local site contact and another engineer, we entered the building. The storage device we would be servicing was in a locked data room. For security purposes, we were permitted to bring only paper documents, writing utensils and basic tools for use on the device. Even shoes were prohibited in this data room, so we borrowed slippers to wear during the task.

    Because this was my first time in a data room, I was nervous but excited. We were surrounded by rack after rack of storage and network devices, and the sound of the fans ventilating the room was quite loud and took some getting used to. The local site contact guided us to the rack that contained the storage device and unlocked it for us. Richard then began inspecting the device, checking for error tickets and performing general troubleshooting.

    After inspecting the device for a while, we decided to begin the replacement. Following Richard’s instructions, I helped remove the power supplies, the network cables and the screws that held the control module in place. The control module was too large and heavy for one person to remove; luckily, I was there for OJT. Richard and I each held one end, removed the control module from the rack, and replaced the old device with the new one. After reinstalling the control module in the rack, we noticed that the same errors were occurring. After we consulted the remote team, the conclusion was that either the new device that had arrived that morning was DOA (Dead On Arrival) or there was an issue with another component. At that point, there was not much else we could do, so the task was rescheduled for a later day when new parts would arrive.

    Two days later, the new parts had arrived, and we went on-site again. We entered the data room and began the task. Once again, we disconnected all the cables, removed the screws, and took the control module out of the rack. Despite replacing the device with a brand new one, we encountered a new error: the device was not functioning as intended. At this point, I was convinced that the problem was the other component, so I suggested that we replace it next. However, Richard pointed out that the other component would be quite difficult to replace, and he had a better idea to try first.

    We removed the control module from the rack and inspected the device further. Richard noticed that it was not seated properly in the control module: the wheels of the device were not aligned with the rails. He carefully aligned the wheels of the device in the control module, and we reinstalled the module in the rack. After reconnecting the cables and refastening the screws, we booted up the device, and to my surprise the robot began functioning properly. The local contact confirmed the connection through the web GUI. At that point, all the problems were resolved, and the task was complete.

    Because I had already completed the training for this storage device, I was familiar with it; however, seeing it in person and getting hands-on experience was great. It made me realize how much more can be learned through OJT in addition to completing training courses. During my first two on-sites I learned many things, but most of them were not related to the technical side. The most important things I learned came from watching Richard work. He was very careful and organized: he labeled every cable he removed to ensure it would be reconnected correctly, and he took note of the serial numbers, the product numbers and the errors we encountered. In the end, his brilliant idea to check and re-seat the device saved us from unnecessarily replacing the other component, which could have created additional problems. I think these were the most valuable things I learned, and I will make sure to remember them for future on-site tasks.