System troubleshooting guide
Infrastructure
First, it is always necessary to ensure that the required infrastructure is working properly, namely the connection to the power grid and to the central servers (network).
The following chapters describe troubleshooting steps, assuming that the required infrastructure is working properly.
Devices
Device’s connectivity
Access the equipment, via SSH or locally, and check the network parameters.
IP / netmask
To check the IP and netmask addresses:
ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 24:1c:04:08:c0:15 brd ff:ff:ff:ff:ff:ff
    inet 192.168.154.180/24 brd 192.168.154.255 scope global enp1s0
       valid_lft forever preferred_lft forever
    inet6 fe80::dd28:2dd2:821f:ba37/64 scope link
       valid_lft forever preferred_lft forever
3: wlan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether c0:21:0d:c2:d3:45 brd ff:ff:ff:ff:ff:ff
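When checking many devices, it can help to reduce this output to just each interface's name and IPv4 address. A minimal sketch, assuming a POSIX shell with awk; the `get_ipv4` helper is our own, not a system tool, and on a live device you would pipe the one-line form of the command into it (`ip -o -4 addr show | get_ipv4`):

```shell
# Hypothetical helper: print "interface address/prefix" pairs from the
# one-line output of `ip -o -4 addr show` (field 2 = interface, 4 = addr).
get_ipv4() {
  awk '{print $2, $4}'
}

# Sample of `ip -o -4 addr show` output, based on the listing above,
# so the sketch runs anywhere:
sample='1: lo    inet 127.0.0.1/8 scope host lo
2: enp1s0    inet 192.168.154.180/24 brd 192.168.154.255 scope global enp1s0'

printf '%s\n' "$sample" | get_ipv4
```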
To check the gateway:
ip route
default via 192.168.154.1 dev enp1s0 proto static metric 100
192.168.154.0/24 dev enp1s0 proto kernel scope link src 192.168.154.180 metric 100
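The default gateway can also be extracted programmatically from this output. A sketch with a hypothetical `default_gateway` helper, fed the sample output above so it is self-contained (live usage would be `ip route | default_gateway`):

```shell
# Hypothetical helper: print the default gateway address from `ip route`
# output (the third field of the "default via ..." line).
default_gateway() {
  awk '$1 == "default" {print $3}'
}

sample='default via 192.168.154.1 dev enp1s0 proto static metric 100
192.168.154.0/24 dev enp1s0 proto kernel scope link src 192.168.154.180 metric 100'

printf '%s\n' "$sample" | default_gateway
```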
The ping to the load balancer server will show if the device's network settings are correct:
ping bloomserverhost -c 1 -w 5
PING bloomserverhost (10.18.218.4) 56(84) bytes of data.
64 bytes from 10.18.218.4 (10.18.218.4): icmp_seq=1 ttl=252 time=1.51 ms

--- bloomserverhost ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.510/1.510/1.510/0.000 ms
bloomserverhost is the load balancer server's hostname, as registered in our DNS.
This command sends a single ping to the specified address and waits at most 5 seconds for a response.
In the example, the ping returned successfully, which means that the device has access to the load balancer server. Other results could be:
$ ping bloomserverhost -c 1 -w 5
PING bloomserverhost (10.18.218.4) 56(84) bytes of data.

--- bloomserverhost ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 4999ms
In this case, the ping can resolve the address, but cannot access the load balancer server. This means that the DNS settings are right, but one of the following problems may be occurring:
- incorrect gateway configuration;
- a problem with the network infrastructure;
- a problem in the load balancer.
Finally, if the hostname cannot be resolved at all, ping fails immediately, which points to a DNS problem (see the next section):
$ ping bloomserverhost
ping: bloomserverhost: Name or service not known
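GNU ping's exit status already encodes the three outcomes shown above: 0 when a reply arrived, 1 when the name resolved but no reply came back, and 2 or higher for errors such as an unresolvable name. A sketch of how a check script could interpret it; `diagnose_ping` is a hypothetical helper, not a system tool:

```shell
# Hypothetical helper: turn ping's exit status into a diagnosis.
# Live usage on a device:
#   ping bloomserverhost -c 1 -w 5 >/dev/null 2>&1; diagnose_ping $?
diagnose_ping() {
  case "$1" in
    0) echo "load balancer reachable" ;;
    1) echo "name resolved but no reply: check gateway, network, load balancer" ;;
    *) echo "name not resolved (or other error): check DNS settings" ;;
  esac
}

diagnose_ping 1
```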
DNS
If the DNS is well configured, the output of the dig command should look something like this:
dig bloomserverhost
; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> bloomserverhost
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18082
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;bloomserverhost.			IN	A

;; ANSWER SECTION:
bloomserverhost.	0	IN	A	10.18.218.4

;; Query time: 14 msec
;; SERVER: 10.131.53.129#53(10.131.53.129)
;; WHEN: Mon Mar 11 12:02:54 GMT 2019
;; MSG SIZE  rcvd: 58
The ‘ANSWER SECTION’ shows the IP address that the DNS server returned for bloomserverhost. The ‘SERVER’ line shows, within parentheses, the IP address of the DNS server that was used.
If the name cannot be resolved, the output will have no ‘ANSWER SECTION’:
$ dig bloomserverhost
; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> bloomserverhost
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 18082
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;bloomserverhost.			IN	A

;; Query time: 14 msec
;; SERVER: 10.131.53.129#53(10.131.53.129)
;; WHEN: Mon Mar 11 12:03:31 GMT 2019
;; MSG SIZE  rcvd: 58
In this case, a DNS server is configured and responding, but it cannot resolve the name. If the DNS server cannot be reached at all, dig gives the error:
$ dig bloomserverhost
; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> bloomserverhost
;; global options: +cmd
;; connection timed out; no servers could be reached
It is possible to test a DNS server with the command dig:
dig bloomserverhost @10.131.53.129
; <<>> DiG 9.9.4-RedHat-9.9.4-61.el7 <<>> bloomserverhost @10.131.53.129
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 18082
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;bloomserverhost.			IN	A

;; ANSWER SECTION:
bloomserverhost.	0	IN	A	10.18.218.4

;; Query time: 14 msec
;; SERVER: 10.131.53.129#53(10.131.53.129)
;; WHEN: Thu Mar 14 12:02:54 GMT 2019
;; MSG SIZE  rcvd: 58
This makes dig query the DNS server 10.131.53.129 directly, regardless of which server is configured on the system. If the query works when the DNS server is specified manually, change the configured DNS server to the correct one in the backoffice.
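When testing DNS servers this way, it helps to extract just the resolved address from dig's output so the results can be compared side by side. A sketch; the `extract_a_record` helper is hypothetical, and here it is fed the answer line from the example above (live usage: `dig bloomserverhost | extract_a_record`):

```shell
# Hypothetical helper: print the address from A records in dig output,
# skipping comment lines (which start with ";").
extract_a_record() {
  awk '!/^;/ && $3 == "IN" && $4 == "A" {print $5}'
}

sample=';; ANSWER SECTION:
bloomserverhost.	0	IN	A	10.18.218.4'

printf '%s\n' "$sample" | extract_a_record
```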
Ticket dispenser’s touchscreen detection
With the command xinput, it is possible to check if the touchscreen of a ticket dispenser is being detected. In the following example, it is detected under the name ILITEK Multi-Touch-V3000.
DISPLAY=:0 xinput
⎡ Virtual core pointer                    id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer          id=4    [slave  pointer  (2)]
⎜   ↳ ILITEK Multi-Touch-V3000            id=6    [slave  pointer  (2)]
⎜   ↳ ILITEK Multi-Touch-V3000            id=7    [slave  pointer  (2)]
⎣ Virtual core keyboard                   id=3    [master keyboard (2)]
    ↳ Virtual core XTEST keyboard         id=5    [slave  keyboard (3)]
If it is not detected, and assuming that the required cables have been verified, the problem is most likely in the hardware.
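The check above can be scripted. A sketch with a hypothetical `has_touchscreen` helper, fed a sample line from the listing above so it is self-contained (live usage: `DISPLAY=:0 xinput | has_touchscreen`):

```shell
# Hypothetical helper: report whether the touchscreen appears in the
# xinput device list.
has_touchscreen() {
  if grep -q "ILITEK Multi-Touch"; then
    echo "touchscreen detected"
  else
    echo "touchscreen NOT detected"
  fi
}

sample='⎜   ↳ ILITEK Multi-Touch-V3000   id=6   [slave pointer (2)]'
printf '%s\n' "$sample" | has_touchscreen
```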
Ticket dispenser’s printer detection
lsusb
Bus 001 Device 005: ID 1051:1000
Bus 001 Device 004: ID 222a:0001
Bus 001 Device 006: ID 0424:7800 Standard Microsystems Corp.
Bus 001 Device 003: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 002: ID 0424:2514 Standard Microsystems Corp. USB 2.0 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
The command lsusb lists all USB devices. If all devices are connected, there should be 6 in total. The device with ID 1051:1000 is the printer (this ID applies only to the current printer model); if it is shown, the printer is being detected.
To ensure that the printer is effectively being detected as a printer:
dmesg | grep lp0
[    4.502138] usblp 1-1.4:1.0: usblp0: USB Bidirectional printer dev 5 if 0 alt 0 proto 2 vid 0x1051 pid 0x1000
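A script can confirm the printer's presence by looking for its USB ID in lsusb output. A sketch with a hypothetical `printer_present` helper, fed a sample line so it runs anywhere (live usage: `lsusb | printer_present`):

```shell
# Hypothetical helper: check that the printer's USB ID (1051:1000 for the
# current printer model) appears in lsusb output.
printer_present() {
  if grep -q "ID 1051:1000"; then
    echo "printer detected"
  else
    echo "printer NOT detected"
  fi
}

printf 'Bus 001 Device 005: ID 1051:1000\n' | printer_present
```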
Finally, check if the service is running:
systemctl status bloom-printerservice.service
● bloom-printerservice.service - Bloom Printer Service
   Loaded: loaded (/etc/systemd/system/bloom-printerservice.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2019-03-17 17:43:12 UTC; 16h ago
 Main PID: 547 (printerservice)
   CGroup: /system.slice/bloom-printerservice.service
           └─547 /opt/qbetter/bin/printerservice

Mar 17 17:43:12 bloomtouch systemd[1]: Started Bloom Printer Service.
Mar 17 17:43:19 bloomtouch printerservice[547]: Printer PRT001 opened
Mar 17 17:43:20 bloomtouch printerservice[547]: Printer model: NP-2511D-2
Mar 17 17:43:20 bloomtouch printerservice[547]: Firmware version: Ver.3.02
Mar 17 17:43:20 bloomtouch printerservice[547]: Boot version: Ver.3.00
Mar 17 17:43:20 bloomtouch printerservice[547]: New connection accepted
Mar 17 17:43:20 bloomtouch printerservice[547]: Starting socket thread
Mar 17 17:43:20 bloomtouch printerservice[547]: Printer status changed to 1
Ticket dispenser’s logs
It is possible to consult the ticket dispenser’s logs as follows:
journalctl -u bloom-deviced -u bloom-printerservice
This command will open all the logs. If the -f flag is added, it is possible to follow the logs as they are created:
journalctl -u bloom-deviced -u bloom-printerservice -f
-- Logs begin at Thu 2016-11-03 17:16:43 UTC. --
Mar 18 10:19:11 bloomtouch deviced[617]: [1B blob data]
Mar 18 10:19:11 bloomtouch deviced[617]: 12000
Mar 18 10:19:11 bloomtouch deviced[617]: [2019-03-18 10:19:11] [FINE] [com.qbetter.qbus.client.QbusClient$WsClient onEvent] Received event: HeartbeatSignal
Mar 18 10:19:21 bloomtouch deviced[617]: [2019-03-18 10:19:21] [FINEST] [com.qbetter.qbus.client.QbusClient$WsClient onMessage] Received message: Event
Mar 18 10:19:21 bloomtouch deviced[617]: ID: a2b68971-ac2c-4025-8a74-4604d7ca06ae
Mar 18 10:19:21 bloomtouch deviced[617]: Event: heartbeatsignal
Mar 18 10:19:21 bloomtouch deviced[617]: Content-length: 5
Mar 18 10:19:21 bloomtouch deviced[617]: [1B blob data]
Mar 18 10:19:21 bloomtouch deviced[617]: 12000
Mar 18 10:19:21 bloomtouch deviced[617]: [2019-03-18 10:19:21] [FINE] [com.qbetter.qbus.client.QbusClient$WsClient onEvent] Received event: HeartbeatSignal
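The heartbeat events above are routine noise, so filtering them out makes real problems easier to spot. A sketch with a hypothetical `filter_heartbeats` helper, fed two sample log lines (live usage: `journalctl -u bloom-deviced -f | filter_heartbeats`):

```shell
# Hypothetical helper: drop any log line mentioning a heartbeat,
# case-insensitively, keeping everything else.
filter_heartbeats() {
  grep -v -i heartbeat
}

sample='Mar 18 10:19:11 bloomtouch deviced[617]: Received event: HeartbeatSignal
Mar 18 10:19:30 bloomtouch deviced[617]: Printer status changed to 1'
printf '%s\n' "$sample" | filter_heartbeats
```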
Player’s logs
The file ‘/opt/qbetter/logs/bloom-dsplayer.log’, available on each player, contains the player's logs, where it is possible to detect connection failures or failures at the player’s startup (corrupted configuration files, etc.).
cat /opt/qbetter/logs/bloom-dsplayer.log
Device migration
Whenever it is necessary to migrate devices from one location to another, the recommended procedure is to access the device via SSH, perform a ‘factory reset’, and register the device in the new location.
Dispenser
- Access the dispenser via SSH.
- Run the command below:
bloom-setup reset
- Wait for the process to end. At the end of the reset process, the SSH session will drop because the device shuts down.
- Turn on the device and follow the registration process.
Player
- Access the player via SSH.
- Run the command below:
bloom-reset -I
- Wait for the process to end. At the end of the reset process, the SSH session will drop because the device shuts down.
- Turn on the device and let it complete the system configuration.
- Follow the registration process.
Servers
Processes
On the application and auxiliary servers, the processes required to serve the pages and write the logs to the database must be running. Check the processes:
systemctl status httpd haproxy logstash
● httpd.service - The Apache HTTP Server
   Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-01-30 17:43:21 WET; 1 months 16 days ago
     Docs: man:httpd(8)
           man:apachectl(8)
  Process: 4409 ExecReload=/usr/sbin/httpd $OPTIONS -k graceful (code=exited, status=0/SUCCESS)
 Main PID: 27225 (httpd)
   Status: "Total requests: 0; Current requests/sec: 0; Current traffic: 0 B/sec"
   CGroup: /system.slice/httpd.service
           ├─17430 /usr/sbin/httpd -DFOREGROUND
           ├─17743 /usr/sbin/httpd -DFOREGROUND
           ├─17746 /usr/sbin/httpd -DFOREGROUND
           ├─18050 /usr/sbin/httpd -DFOREGROUND
           ├─18152 /usr/sbin/httpd -DFOREGROUND
           ├─18156 /usr/sbin/httpd -DFOREGROUND
           └─27225 /usr/sbin/httpd -DFOREGROUND
[...]
● haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2019-01-30 17:43:21 WET; 1 months 16 days ago
 Main PID: 27224 (haproxy-systemd)
   CGroup: /system.slice/haproxy.service
           ├─27224 /usr/sbin/haproxy-systemd-wrapper -f /etc/haproxy/haproxy....
           ├─27226 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/hapr...
           └─27227 /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/hapr...
● logstash.service - logstash
   Loaded: loaded (/etc/systemd/system/logstash.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-01-22 04:40:28 WET; 1 months 24 days ago
 Main PID: 537 (java)
   CGroup: /system.slice/logstash.service
           └─537 /usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:C...
To start/restart the processes:
sudo systemctl restart httpd haproxy logstash
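After restarting, it is worth verifying that each service actually came back up. A sketch with a hypothetical `report_service` helper; on a live server the state argument would come from `systemctl is-active`, which prints "active" when a unit is running:

```shell
# Hypothetical helper: report a service's state.
# Live usage:
#   for svc in httpd haproxy logstash; do
#     report_service "$svc" "$(systemctl is-active "$svc")"
#   done
report_service() {
  if [ "$2" = "active" ]; then
    echo "$1: running"
  else
    echo "$1: NOT running ($2)"
  fi
}

report_service httpd active
report_service logstash failed
```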
Auxiliary server extra processes:
systemctl status bloom-bst bloom-core bloom-proxy bloom-qbus
● bloom-bst.service - Q-Better bst daemon
   Loaded: loaded (/etc/systemd/system/bloom-bst.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-01-22 04:40:41 WET; 1 months 24 days ago
 Main PID: 880 (bst)
   CGroup: /system.slice/bloom-bst.service
           └─880 /home/centos/qbetter/bin/bst -c /home/centos/qbe...
● bloom-core.service - Q-Better bloom-core daemon
   Loaded: loaded (/etc/systemd/system/bloom-core.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-01-22 04:40:41 WET; 1 months 24 days ago
 Main PID: 877 (bloom-core)
   CGroup: /system.slice/bloom-core.service
           ├─877 /home/centos/qbetter/bin/bloom-core -c /home/cen...
           ├─882 /bin/bash -c java -Dfile.encoding=UTF-8 -jar -Dj...
           └─884 java -Dfile.encoding=UTF-8 -jar -Djava.library.p...
● bloom-proxy.service - Q-Better bloom-proxy daemon
   Loaded: loaded (/etc/systemd/system/bloom-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-01-22 04:40:41 WET; 1 months 24 days ago
 Main PID: 883 (bloom-proxy)
   CGroup: /system.slice/bloom-proxy.service
           └─883 /home/centos/qbetter/bin/bloom-proxy -c /home/ce...
● bloom-qbus.service - Q-Better Q-Bus daemon
   Loaded: loaded (/etc/systemd/system/bloom-qbus.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-01-22 04:40:41 WET; 1 months 24 days ago
 Main PID: 881 (qbus)
   CGroup: /system.slice/bloom-qbus.service
           └─881 /home/centos/qbetter/bin/qbus -c /home/centos/qb…
The bloom-qbus process is the one all devices connect to via socket. If this process is not running, the system will not function properly. Restarting it causes a load "burst" on the application servers, as all devices will request their configurations again.
The bloom-core process is responsible for consolidating data and scheduling tasks such as sending reports, resetting daily tickets, "cleaning" appointments, managing device status, etc. This process can be restarted without causing downtime in the system.
The other two services are auxiliary and can be restarted without impact.
Logs
The logs for these services can be found in:
/home/bloom/qbetter/logs
The web server's (Apache) access and error logs can be found in:
/home/bloom/qbetter/logs/apache
The logs described below are the ones available in the system's backoffice (pages Logs > Events and Logs > Actions), where they can also be downloaded in CSV format; this is usually the easiest way to consult them. They contain all changes made via the backoffice, user logins and logouts, interactions with SDB and ID Access, SMS messages and emails, and queueing events (generated tickets, served tickets, ticket resets).
/home/bloom/qbetter/logs/bloom-php.log
/home/bloom/qbetter/logs/bloom-php-actions.log
The logs for the ‘bloom-core’ service are rotated files matching the name below, ‘javaEvents.log.0’ being the most recent and ‘javaEvents.log.1’ the oldest. These logs contain information about the data consolidation and scheduled tasks mentioned above.
/home/bloom/qbetter/logs/javaEvents.log.*
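A quick way to sift these rotated files is to grep across all of them at once. A sketch with a hypothetical `search_core_logs` helper; the "ERROR" keyword is an assumption about the log format, and the demo uses a temporary file standing in for a real log so it is self-contained:

```shell
# Hypothetical helper: search the given log files for a pattern,
# case-insensitively, suppressing filename prefixes (-h).
# Live usage: search_core_logs ERROR /home/bloom/qbetter/logs/javaEvents.log.*
search_core_logs() {
  pattern=$1; shift
  grep -h -i "$pattern" "$@"
}

# Self-contained demo:
tmp=$(mktemp)
printf 'task started\ntask ERROR: consolidation failed\n' > "$tmp"
search_core_logs error "$tmp"
rm -f "$tmp"
```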
Web server
The addresses of the status pages are:
HAProxy
http://<app_server>:1434/stats
Apache
http://<app_server>:54322/server-status
The HAProxy status page asks for authentication; the username is ‘admin’ and the password is ‘password’.
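The status pages can also be probed from the command line. A sketch: the curl invocation in the comment prints only the HTTP status code (the host is the same placeholder used above), and the `check_code` helper, which is hypothetical, interprets it:

```shell
# Hypothetical helper: interpret the HTTP status code returned when
# probing a status page. Live usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' -u admin:password "http://<app_server>:1434/stats")
#   check_code "$code"
check_code() {
  case "$1" in
    200) echo "status page OK" ;;
    401) echo "authentication required or wrong credentials" ;;
    000) echo "no HTTP response: check that the service is running" ;;
    *)   echo "unexpected response: $1" ;;
  esac
}

check_code 200
```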