Original Question:
> We're having a problem with backups dying after erroneously reporting
> that a tape is full.  We're running Solaris 7 with the Solaris Data
> Backup Utility from Easy Access Server 2.0 on an Ultra 2.  Our tape
> drive is an Exabyte 8505 XL OEM'd as Sun model 611, part no.
> 599-2035-02.  The tape drive is last in the SCSI chain, with a 52G
> Andataco GigaRAID/SX before it.  Previously the RAID array was last in
> the chain.  After moving to the new configuration, the problem takes
> longer to occur, but backups still fail.  We previously used this tape
> drive on an SS10 with Solaris 2.6 and Solstice Backup 4.2.6 with no
> problems.  Below is the syslog output generated during a failed backup.
> Any help would be greatly appreciated.
We still don't know exactly what is going wrong with the original
configuration, but simply putting the tape drive on a different Ultra 2
has worked, so we will be installing another SCSI bus on our server to
run the RAID and tape separately.
Here is what people suggested might be wrong or might help:
----------------------------
1. SCSI chain is too long. 2m (2 yards) for SCSI-2, 1m for SCSI-3
2. Faulty cable or connection
3. The tape media is dead (or dying) and should be replaced.
4. The tape drive just died.
----------------------------
These problems are difficult to pinpoint without some experimenting.
Almost always, "SCSI transport failed" is the result of SCSI spec
being violated because of poor cabling, poor termination, or
cable length.  
Since you switched the order of the devices on the chain, did you
introduce a new cable, and did it lengthen the chain length?  You
also introduced a new terminator to the chain (now on the 8505).
Either of these may be poorer quality.  Your RAID device appears
to be Fast wide - your tape device is fast narrow (true?).  The
cable between the wide and narrow devices must terminate the high-9
pairs.  Not all 68-50 pin cables do this properly.  (Chad: The tape
drive is actually a wide device.)
A helpful tool "scsiinfo" can also help identify problems with SCSI
devices (search scsiinfo on internet).
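To get a feel for how often the "SCSI transport failed" message mentioned
above shows up during a backup run, something like the following could be
used. It is only a rough sketch: the function names are hypothetical, it does
a plain substring match and assumes nothing about the log line layout, and
/var/adm/messages is assumed to be where syslog writes on the server.

```python
def count_transport_failures(lines, needle="SCSI transport failed"):
    """Count the lines that contain the given error text."""
    return sum(1 for line in lines if needle in line)


def count_in_file(path="/var/adm/messages", needle="SCSI transport failed"):
    """Count matching lines in a log file (the path is an assumption)."""
    with open(path, errors="replace") as log:
        return count_transport_failures(log, needle)
```

Comparing the count before and after a cabling or terminator change gives a
crude indication of whether the change helped.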
----------------------------
Solaris Data Backup Utility is rebadged Legato Networker.  The error
messages indicate a write failure on the tape drive and Networker
interprets that as "end of tape".
The write error may come from a number of things:
   SCSI chain too long
   Dirty heads on the 8505
   Failed 8505
   Low quality tape media
I don't know if SDBU ships the complete set of Networker executables,
but you might try looking for this cool little program called
"tapeexercise" which will torment your drive to a fare-thee-well -- it's
very thorough.
(Chad: tapeexercise succeeded)
...
(BTW: There is a mailing list for Networker issues:
Subscription is via majordomo and there is a digest available as well.)
----------------------------
You have SCSI interface problems between the scsi controller
and the old scsi-2 Exabyte 8505.
Try putting the tape drive before the raid array.
Try putting the tape drive last, but use a "perfect I/O" terminator.
Play with the SCSI options in /kernel/drv/isp.conf, for example:
name="isp" parent="/fas@xx,xxxx/fas@x" unit-address="5"
scsi-options=0x1f8;
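
For reference, scsi-options is a bit mask. The sketch below decodes a value
into the features it enables. The bit meanings are as I recall them from the
Solaris SCSI headers (sys/scsi/conf/autoconf.h), so treat them as an
assumption and check them against your release; the function name is
hypothetical.

```python
# Bit meanings assumed from the Solaris SCSI_OPTIONS_* definitions;
# verify against the headers on your own system before relying on them.
SCSI_OPTION_BITS = {
    0x008: "disconnect/reconnect",
    0x010: "linked commands",
    0x020: "synchronous transfer",
    0x040: "parity",
    0x080: "tagged command queuing",
    0x100: "fast SCSI",
    0x200: "wide SCSI",
    0x400: "Fast-20 (Ultra) SCSI",
}


def decode_scsi_options(value):
    """Return the names of the option bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SCSI_OPTION_BITS.items())
            if value & bit]


for name in decode_scsi_options(0x1f8):
    print(name)
```

Under those assumed definitions, 0x1f8 leaves the wide and Fast-20 bits
clear, so the setting suggested above would hold that target to fast, narrow
transfers.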
-----------------------------
And the winners (sort of):
From Richard Smith:
Can you isolate the tape drive on another SCSI adapter apart from the
RAID?  Keep the SCSI length as short as possible (2m is a practical
max for F/W devices).
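
A quick way to check a chain against that 2m figure is to add up every
segment, including the cabling inside each enclosure. This is a minimal
sketch: the segment names and lengths are made up for illustration, and the
2m default is simply the practical maximum quoted above, not a value taken
from the SCSI specification.

```python
def check_chain(segments_m, limit_m=2.0):
    """Return (total length in metres, True if the total is within the limit)."""
    total = sum(segments_m.values())
    return total, total <= limit_m


# Hypothetical measurements; replace them with your own.
example_chain = {
    "host to RAID cable": 0.8,
    "inside RAID enclosure": 0.3,
    "RAID to tape cable": 0.8,
    "inside tape enclosure": 0.2,
}

total, ok = check_chain(example_chain)
print("total %.1f m, within limit: %s" % (total, ok))
```

With these made-up numbers the chain comes to about 2.1m, just over the
limit, which shows how easily internal cabling can push a bus past it.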
From Mark Ashley:
I'm surprised you haven't given the RAID its own SCSI bus so that it
can clip along at full speed without the bus being crippled to talk to
the tape drive. I suggest another interface for the slower devices.
-----------------------------
Thanks to:
David Evans
Mark Ashley
Richard Smith
Reto Lichtensteiger
Bismark Espinoza
--
Chad Campbell
Software Engineer, Innovision Corporation
Chad.Campbell@innovision.com
(913)226-8700